The present invention provides a purified protein
designated P/CAF having a molecular weight of about
93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing
conditions and which acetylates histones and which also
binds to the p300/CBP cellular protein. The present
invention further provides a nucleic acid encoding the
P/CAF protein as well as a vector containing the nucleic
acid and a host for the vector. A purified antibody
which specifically binds the P/CAF protein is also
provided. Also provided are methods of screening for
compounds that inhibit or stimulate the transcription
modulating and histone acetyltransferase activity of P/CAF
and p300/CBP.

P300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF AND USES THEREOF



BACKGROUND OF THE INVENTION 

Field of the Invention 

The present invention provides a transcriptional co-factor, p300/CBP-associated 
factor (P/CAF), which modulates transcription through binding to the cellular 
transcription co-factors p300 and CBP and through acetylation of histones. Also 
provided are methods for screening for the presence of P/CAF and for substances which
alter the transcription modulating effect and growth regulatory activity of P/CAF. 



Background Art 

Cellular proteins p300 and CBP are global transcriptional coactivators that are
involved in the regulation of various DNA-binding transcriptional factors (Janknecht and
Hunter, 1996). Recently, p300 was found to be very closely related to CBP, a factor
that binds selectively to the protein kinase A-phosphorylated form of CREB (3-5).
Cellular factors p300 and CBP exhibit strong amino acid sequence similarity and share
the capacity to bind both CREB and E1A (6-8). Although neither p300 nor CBP by
itself binds to DNA, each can be recruited to promoter elements via interaction with
sequence-specific activators and can function as a transcriptional adaptor. For
simplicity, p300 and CBP will be termed p300/CBP in the context of discussing their
shared functional properties.

p300/CBP is a large protein consisting of over 2,400 amino acids, known to
interact with a variety of DNA-binding transcriptional factors including nuclear hormone
receptors (13,57), CREB (3,4,7), c-Jun/v-Jun (9,11), YY1 (10), c-Myb/v-Myb (12,58),
Sap-1a (59), c-Fos (11) and MyoD (60). DNA-binding factors recruit p300/CBP not
only by direct but also by indirect interactions through cofactors; for example, nuclear
hormone receptors recruit p300/CBP directly as well as indirectly via
SRC-1, which stimulates transcription by binding to various nuclear hormone receptors
(13,61).

The transforming proteins encoded by adenovirus and several other small DNA
tumor viruses disturb host cell growth control by interacting with cellular factors that
normally function to repress cell proliferation. One of the most intensively studied of
these viral proteins, the product of the adenovirus E1A gene, is itself sufficient for
transformation (1). E1A transforming activity resides in two distinct domains, the
targets of which include p300/CBP and products of the retinoblastoma (RB)
susceptibility gene family (1,2). Interactions of E1A with p300/CBP and RB are
thought to influence functionally distinct growth regulatory pathways, allowing the two
domains to contribute additively to transformation (1).

The paradigm for how E1A and functionally related viral proteins perturb cell
growth regulation derives in large part from studies on their interactions with RB (1,2).
The molecular function of E1A is based on its capacity to interfere with cellular protein-
protein interactions. Since both E1A and various cellular targets bind to a site in RB
termed the pocket domain (2), E1A can competitively disrupt the complex formation
between RB and its cellular targets.

The second cellular factor implicated in E1A-dependent transformation, p300, is
believed to inhibit G0/G1 exit, to activate certain enhancers, and to stimulate
differentiation (1,2). E1A inhibits the p300/CBP-mediated transcriptional activation of
many promoters (14). In one case that has been examined, the complex of p300 and
YY1, E1A inhibits transcription without disrupting the complex (10).

The present invention provides a cellular protein designated P/CAF which binds 
to p300/CBP and plays an important role in both transcription and cell cycle regulation
associated with a histone acetyltransferase activity. The present invention also provides 
a histone acetyltransferase activity in the p300/CBP cellular protein, thus providing 
targets for modulating transcription and cell cycle regulation in cells. 



SUMMARY OF THE INVENTION



The present invention provides a purified protein designated P/CAF having a 
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate 
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones and which also binds to the p300/CBP cellular protein. 

The present invention further provides a nucleic acid encoding the P/CAF 
protein as well as a vector containing the nucleic acid and a host for the vector. A
purified antibody which specifically binds the P/CAF protein is also provided.

Also provided is a bioassay for screening substances for the ability to
inhibit the transcription modulating activity and/or histone acetyltransferase
activity of P/CAF, comprising contacting the substance with a system in which histone
acetylation by P/CAF can be determined; determining the amount of histone acetylation
by P/CAF in the presence of the substance; and comparing the amount of histone
acetylation by P/CAF in the presence of the substance with the amount of histone
acetylation by P/CAF in the absence of the substance, a decreased amount of histone
acetylation by P/CAF in the presence of the substance indicating a substance that can
inhibit the transcription modulating activity and/or histone acetyltransferase activity
of P/CAF.

Furthermore, the present invention provides a bioassay for screening substances
for the ability to inhibit the transcription modulating activity and/or histone
acetyltransferase activity of P/CAF comprising contacting the substance with a system in
which the p300 binding of P/CAF can be determined; determining the amount of p300
binding of P/CAF in the presence of the substance; and comparing the amount of p300
binding of P/CAF in the presence of the substance with the amount of p300 binding of
P/CAF in the absence of the substance, a decreased amount of p300 binding of P/CAF in
the presence of the substance indicating a substance that can inhibit the transcription
modulating activity and/or histone acetyltransferase activity of P/CAF.

Also provided is a method for determining the amount of P/CAF in a biological
sample comprising contacting the biological sample with a polypeptide comprising the
amino acid sequence of SEQ ID NO:3 under conditions whereby a P/CAF/p300
complex can be formed; and determining the amount of the P/CAF/p300 complex, the
amount of the complex indicating the amount of P/CAF in the sample.

The present invention additionally provides a method for determining the amount 
of P/CAF in a biological sample comprising contacting the biological sample with an 
antibody which specifically binds P/CAF under conditions whereby a P/CAF/antibody 
complex can be formed; and determining the amount of the P/CAF/antibody complex,
the amount of the complex indicating the amount of P/CAF in the sample. 

Also provided herein is an assay for screening substances for the ability to inhibit
or stimulate the histone acetyltransferase activity of P/CAF, comprising: contacting the
substance with a system in which histone acetylation by P/CAF can be determined;
determining the amount of histone acetylation by P/CAF in the presence of the
substance; and comparing the amount of histone acetylation by P/CAF in the presence of
the substance with the amount of histone acetylation by P/CAF in the absence of the
substance, a decreased or increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can inhibit or stimulate,
respectively, the histone acetyltransferase activity of P/CAF.
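
The comparison step of this assay can be illustrated with a minimal sketch. The function name, the tolerance parameter and the numeric readouts below are hypothetical and are not taken from the specification; the sketch only assumes that histone acetylation has been quantified (in any consistent unit) with and without the test substance.

```python
def classify_substance(with_substance, without_substance, tolerance=0.0):
    """Classify a test substance from two acetylation measurements.

    with_substance / without_substance: amount of histone acetylation
    measured in the presence / absence of the substance (same units).
    tolerance: hypothetical fractional band around 1.0 treated as
    "no effect" (an assumption, not part of the described assay).
    """
    if without_substance <= 0:
        raise ValueError("control acetylation must be positive")
    ratio = with_substance / without_substance
    if ratio < 1.0 - tolerance:
        return "inhibits"      # decreased acetylation
    if ratio > 1.0 + tolerance:
        return "stimulates"    # increased acetylation
    return "no effect"


# Illustrative readouts (arbitrary units), not experimental data.
print(classify_substance(40.0, 100.0, tolerance=0.1))
print(classify_substance(150.0, 100.0, tolerance=0.1))
```

In practice the tolerance would be chosen from the replicate-to-replicate variability of the assay; the value used here is only a placeholder.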

The present invention further provides an assay for screening substances for the
ability to inhibit binding of P/CAF to p300/CBP comprising: contacting the substance
with a system in which the binding of P/CAF to p300/CBP can be determined; determining
the amount of binding of P/CAF to p300/CBP in the presence of the substance; and
comparing the amount of binding of P/CAF to p300/CBP in the presence of the
substance with the amount of binding of P/CAF to p300/CBP in the absence of the
substance, a decreased amount of binding of P/CAF to p300/CBP in the presence of the
substance indicating a substance that can inhibit binding of P/CAF to
p300/CBP.

In addition, an assay is provided for screening substances for the ability to inhibit
or stimulate the histone acetyltransferase activity of p300/CBP, comprising: contacting
the substance with a system in which histone acetylation by p300/CBP can be
determined; determining the amount of histone acetylation by p300/CBP in the presence
of the substance; and comparing the amount of histone acetylation by p300/CBP in the
presence of the substance with the amount of histone acetylation by p300/CBP in the
absence of the substance, a decreased or increased amount of histone acetylation by
p300/CBP in the presence of the substance indicating a substance that can inhibit or
stimulate, respectively, the histone acetyltransferase activity of p300/CBP.



Furthermore, the present invention provides an assay for screening substances
for the ability to inhibit binding of a DNA-binding transcription factor to p300/CBP
comprising: contacting the substance with a system in which the binding of the DNA-binding
transcription factor to p300/CBP can be determined; determining the amount of
binding of the DNA-binding transcription factor to p300/CBP in the presence of the substance;
and comparing the amount of binding of the DNA-binding transcription factor to p300/CBP
in the presence of the substance with the amount of binding of the DNA-binding
transcription factor to p300/CBP in the absence of the substance, a decreased amount of
binding of the DNA-binding transcription factor to p300/CBP in the presence of the
substance indicating a substance that can inhibit binding of the DNA-binding
transcription factor to p300/CBP.



A method is also provided for inhibiting the transcription modulating activity of
P/CAF in a subject, comprising administering to the subject a transcription modulating
activity inhibiting amount of a substance in a pharmaceutically acceptable carrier.



Also provided in the present invention is a method for stimulating the 
transcription modulating activity of P/CAF in a subject, comprising administering to the 
subject a transcription modulating activity stimulating amount of a substance in a
pharmaceutically acceptable carrier. 

Furthermore, the present invention provides a method for inhibiting the histone
acetyltransferase activity of p300/CBP in a subject, comprising administering to the
subject a histone acetyltransferase activity inhibiting amount of a substance in a
pharmaceutically acceptable carrier.

Finally, the present invention additionally provides a method for stimulating the 
histone acetyltransferase activity of p300/CBP in a subject, comprising administering to 
the subject a histone acetyltransferase activity stimulating amount of a substance in a 
pharmaceutically acceptable carrier.



BRIEF DESCRIPTION OF THE FIGURES 



Figs. 1A-B. Fig. 1A: P/CAF-p300/CBP interaction in vivo. Cell extract was
immunoprecipitated with rabbit anti-P/CAF (lanes 1, 4, and 7), rabbit anti-CBP (lanes 2
and 5), and mouse anti-p300 (lane 9) antibodies. For controls, cell extract was
precipitated with rabbit control IgG (lanes 3, 6, and 8) or mouse anti-HA monoclonal
antibody (lane 10). The precipitates were analyzed by immunoblotting with anti-P/CAF
(lanes 1-3), anti-CBP (lanes 4-6), and anti-p300 (lanes 7-10) antibodies. The positions
of non-specific bands are indicated by asterisks. Fig. 1B: E1A inhibits the P/CAF-p300
interaction in vivo. Osteosarcoma cells were transfected with either control vector
(lanes 1 and 4) or E1A- (lanes 2 and 5) or E1AΔN- (lanes 3 and 6) expression vectors.
Extract from the transfected subpopulation was immunoprecipitated with anti-P/CAF
(lanes 1-3) or control (lanes 4-6) IgG. The precipitates were analyzed by
immunoblotting with anti-p300 and anti-P/CAF.

Figs. 2A-F. P/CAF and E1A mediate antagonistic effects on cell cycle
progression. HeLa cells (ATCC accession number CCL 2) were transfected by
electroporation with 7 µg of P/CAF-expression plasmid and/or 3 µg of the full-length or
the N-terminally deleted (Δ2-36) E1A 12S-expression plasmid as indicated in the figure.
These plasmids were constructed by subcloning FLAG-P/CAF and E1A cDNAs into
pCX (34) and pcDNAI (Invitrogen), respectively. All samples, in addition, contained 1
µg of sorting plasmid (pCMV-IL2R) (31) and carrier plasmid (pCX) to normalize the
total amount of DNA to 11 µg. After transfection, cells were incubated in Dulbecco's
modified Eagle's medium with 10% fetal bovine calf serum for 12 hours and
subsequently labeled in medium containing 10 µM bromo-deoxyuridine (BrdU) for 30
min. Subsequently, the transfected subpopulation was purified by magnetic affinity cell
sorting and nuclei were analyzed by dual parameter flow cytometry as described (32).
Histograms show percentages of cells in G1 and S phases. Abscissa values represent
fluorescence intensity of bound anti-BrdU antibodies in log scale.

Fig. 3. Histone acetyltransferase activity of P/CAF. Activity of hGCN5 (lanes 1
and 4) and P/CAF (lanes 2 and 5) that acetylates free histones (lanes 1-3) or histones in
the nucleosome core particle (35) (lanes 4-6) was measured as described (36). Each
reaction contains 0.3 pmol of affinity purified FLAG-hGCN5 or FLAG-P/CAF, 4 pmol
of the histone octamer or the nucleosome core particle and 10 pmol of [1-14C]acetyl-CoA.
Note that the histone octamer dissociates into dimers or tetramers under assay
conditions. Acetylated histones were detected by autoradiography after separation by
SDS-PAGE. The bands corresponding to acetylated histones H3 and H4 are indicated
by arrows.

DETAILED DESCRIPTION OF THE INVENTION 

As used in the specification and in the claims, "a" can mean one or more, 
depending upon the context in which it is used. 

P/CAF protein and fragments 

The present invention provides a purified protein designated P/CAF having a 
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate 
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones. The P/CAF protein can also bind to the amino acid region of SEQ ID NO:3
(amino acid (aa) residues 1753-1966) of the cellular transcriptional factor, p300 (which
has the complete amino acid sequence of SEQ ID NO:6 and the nucleotide sequence of
SEQ ID NO:12), and the amino acid region of SEQ ID NO:6 (amino acid residues
1805-1854) of the cellular transcriptional factor, CBP (which has the complete amino acid
sequence of SEQ ID NO:7 and the nucleotide sequence of SEQ ID NO:13). The
P/CAF protein can be defined by any one or more of the typically used parameters.
Examples of these parameters include, but are not limited to, molecular weight
(calculated or empirically determined), isoelectric focusing point, specific epitope(s),
complete amino acid sequence, sequence of a specific region (e.g., N-terminus) of the
amino acid sequence and the like.

For example, the P/CAF protein can consist of the amino acid sequence of SEQ
ID NO:1 or the P/CAF protein can comprise the amino acid sequence of SEQ ID NO:2,
which represents the carboxy terminal end of the P/CAF protein and contains the histone
acetyltransferase activity, or the amino acid sequence of SEQ ID NO:4, which
represents the amino terminal end of the P/CAF protein, containing the binding site for
p300/CBP. Because the amino-terminal region is specific for P/CAF it can be used to
define and identify P/CAF.

As used herein, "purified" refers to a protein (polypeptide, peptide, etc.) that is
sufficiently free of contaminants or cell components with which it normally occurs to
distinguish it from the contaminants or other components of its natural environment.
The purified protein need not be homogeneous, but must be sufficiently free of
contaminants to be useful in a clinical or research setting, for example, in an assay for
detecting antibodies to the protein. Greater levels of purity can be obtained using
methods derived from well known protocols. Specific methods for purifying P/CAF
proteins are known in the art.

As will be appreciated by those skilled in the art, the invention also includes
those P/CAF polypeptides having slight variations in amino acid sequence which yield
polypeptides equivalent to the P/CAF protein defined herein. Such variations may arise
naturally as allelic variations (e.g., due to genetic polymorphism) or may be produced by
human intervention (e.g., by mutagenesis of cloned DNA sequences), such as induced
point, deletion, insertion and substitution mutants. Minor changes in amino acid
sequence are generally preferred, such as conservative amino acid replacements, small
internal deletions or insertions, and additions or deletions at the ends of the molecules.
Substitutions may be designed based on, for example, the model of Dayhoff, et al. (37).
These modifications can result in changes in the amino acid sequence, provide silent
mutations, modify a restriction site, or provide other specific mutations.

Modifications to any of the P/CAF proteins or fragments can be made, while
preserving the specificity and activity (function) of the native protein or fragment
thereof. As used herein, "native" describes a protein that occurs in nature. The
modifications contemplated herein can be conservative amino acid substitutions, for
example, the substitution of a basic amino acid for a different basic amino acid.
Modifications can also include creation of fusion proteins with epitope tags or known
recombinant proteins or genes encoding them created by subcloning into commercial or
non-commercial vectors (e.g., polyhistidine tags, flag tags, myc tag, glutathione-S-
transferase [GST] fusion protein, xylE fusion reporter construct). Furthermore, the
modifications can be such as do not affect the function of the protein or the way the
protein accomplishes that function (e.g., its secondary structure or the ultimate result of
the protein's activity). These products are equivalent to the P/CAF protein. The means
for determining the function, way and result parameters are well known.
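
As an illustration of what a conservative substitution means in practice, the following sketch tests whether two residues fall in the same physicochemical class. The grouping shown is one common textbook classification chosen here as an assumption; it is not the Dayhoff model cited above, and the function names are hypothetical.

```python
# One common (assumed) grouping of amino acids by side-chain character,
# using one-letter codes. C, G and P are deliberately left ungrouped.
RESIDUE_CLASSES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "polar": set("STNQ"),
    "hydrophobic": set("AVLIM"),
    "aromatic": set("FWY"),
}


def residue_class(residue):
    """Return the class name for a residue, or None if it is ungrouped."""
    for name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return name
    return None


def is_conservative(original, replacement):
    """True if a different residue of the same class is substituted."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # not a substitution at all
    cls = residue_class(original)
    return cls is not None and cls == residue_class(replacement)


print(is_conservative("K", "R"))  # basic for a different basic residue
print(is_conservative("K", "D"))  # basic for acidic
```

A substitution matrix would give a graded score instead of a yes/no answer; the class lookup is used here only because it mirrors the "basic for a different basic" example in the text.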

Having provided an example of a purified P/CAF protein, the invention also
enables the purification of P/CAF homologs from other species and allelic variants from
individuals within a species. For example, an antibody raised against the exemplary
human P/CAF protein can be used routinely to screen preparations from different
humans for allelic variants of the P/CAF protein that react with the P/CAF protein-
specific antibody. Similarly, an antibody raised against an epitope, for example, from a
conserved amino acid region of the human P/CAF protein can be used to routinely
screen for homologs of the P/CAF protein in other species. A P/CAF protein can be
routinely identified in and obtained from other species and from individuals within a
species using the methods taught herein and others known in the art. For example,
given the present sequence, the DNA encoding a conserved amino acid sequence can be
used to probe genomic DNA or DNA libraries of an organism to predictably obtain the
P/CAF gene for that organism. The gene can then be cloned and expressed as the
P/CAF protein and purified according to any of a number of routine, predictable
methods. An example of the routine protein purification methods available in the art can
be found in Pei et al. (38).

A purified polypeptide fragment of the P/CAF protein is also provided. The
term "fragment" as used herein regarding a P/CAF protein, means a molecule of at least
five contiguous amino acids of P/CAF protein that has at least one function shared by
P/CAF protein or a region thereof. These functions can include antigenicity, binding
capacity, acetyltransferase activity and structural roles, among others. The P/CAF
fragment can be specific for a recited source. As used herein to describe an amino acid
sequence (protein, polypeptide, peptide, etc.), "specific" means that the amino acid
sequence is not found identically in any other source. The determination of specificity is
made routine by the availability of computerized amino acid sequence databases and
sequence comparison programs, wherein an amino acid sequence of almost any length
can be quickly and reliably checked for the existence of identical sequences. If an
identical sequence is not found, the protein is "specific" for the recited source. For
example, a P/CAF fragment can be species-specific (e.g., found in the P/CAF protein of
humans, but not of other species).
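
The identity check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the sequences are short made-up strings standing in for database entries, the function names are hypothetical, and a real search would be run against a sequence database with a dedicated comparison program.

```python
def fragments(sequence, length=5):
    """List every contiguous window of `length` residues in `sequence`."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def is_specific(fragment, other_sequences):
    """True if `fragment` is not found identically in any other sequence."""
    return not any(fragment in seq for seq in other_sequences)


def specific_fragments(sequence, other_sequences, length=5):
    """Windows of `sequence` that occur in none of `other_sequences`."""
    return [f for f in fragments(sequence, length)
            if is_specific(f, other_sequences)]


# Toy one-letter sequences (hypothetical, not P/CAF or database entries).
query = "MKRTLAG"
others = ["MKRTLQQ"]
print(specific_fragments(query, others))
```

The window length of five matches the minimum fragment size given in the text; longer windows are correspondingly more likely to be specific.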

A fragment of the P/CAF protein having histone acetyltransferase activity can
consist of the amino acid sequence of SEQ ID NO:2. A fragment of the P/CAF protein
which binds to the amino acid sequence of SEQ ID NO:3 on p300 and the amino acid
sequence of SEQ ID NO:9 on CBP can consist of the amino acid sequence of SEQ ID
NO:4. To the extent that these fragments are specific for P/CAF, they can be used to
identify and define P/CAF.

An antigenic fragment of P/CAF protein is provided. An antigenic fragment has
an amino acid sequence of at least about five consecutive amino acids of a P/CAF
protein amino acid sequence and binds an antibody or elicits an immune response in an
animal. An antigenic fragment can be selected by applying the routine technique of
epitope mapping to P/CAF protein to determine the regions of the proteins that contain
epitopes reactive with antibodies or are capable of eliciting an immune response in an
animal. Once the epitope is selected, an antigenic polypeptide containing the epitope
can be synthesized directly, or produced recombinantly by cloning nucleic acids
encoding the antigenic polypeptide in an expression system, according to standard
methods.

Alternatively, an antigenic fragment of the antigen can be isolated from the 
whole P/CAF protein or a larger fragment of the P/CAF protein by chemical or 
mechanical disruption. Fragments can also be randomly chosen from a known P/CAF 
protein sequence and synthesized. The purified fragments thus obtained can be tested to
determine their antigenicity and specificity by routine methods. 

Nucleic Acids Encoding P/CAF Protein

An isolated nucleic acid that encodes a P/CAF protein is also provided. As used
herein, the term "isolated" means a nucleic acid separated or substantially free from at
least some of the other components of the naturally occurring organism, for example,
the cell structural components commonly found associated with nucleic acids in a
cellular environment and/or other nucleic acids. The isolation of nucleic acids can
therefore be accomplished by techniques such as cell lysis followed by phenol plus
chloroform extraction, followed by ethanol precipitation of the nucleic acids (39). It is
not contemplated that the isolated nucleic acids are necessarily totally free of all non-
nucleic acid components or all other nucleic acids, but that the isolated nucleic acids are
isolated to a degree of purification to be useful in clinical, diagnostic, experimental, or
other procedures such as, for example, gel electrophoresis, Southern, Northern or dot
blot hybridization, or polymerase chain reaction (PCR).

A skilled artisan in the field will readily appreciate that there are a multitude of
procedures which may be used to isolate the nucleic acids prior to their use in other
procedures. These include, but are not limited to, lysis of the cell followed by gel
filtration or anion exchange chromatography, binding DNA to silica in the form of glass
beads, filters or diatoms in the presence of high concentrations of chaotropic salts, or
ethanol precipitation of the nucleic acids.

The nucleic acids of the present invention can include positive and negative
strand RNA as well as DNA and can include genomic and subgenomic nucleic acids
found in the naturally occurring organism. The nucleic acids contemplated by the
present invention include double stranded and single stranded DNA of the genome,
complementary positive stranded cRNA and mRNA, and complementary cDNA
produced therefrom and any nucleic acid which can selectively or specifically hybridize
to the isolated nucleic acids provided herein. Stringent conditions (further described
below) are used to distinguish selectively or specifically hybridizing nucleic acids from
non-selectively and non-specifically hybridizing nucleic acids.

An isolated nucleic acid that encodes a P/CAF protein can be species-specific 
(i.e., does not encode the P/CAF protein of other species and does not occur in other 
species). Examples of the nucleic acids contemplated herein include the nucleic acid of
SEQ ID NO: 10 as well as the nucleic acids that encode each of the P/CAF proteins or 
fragments thereof described herein. P/CAF proteins and protein fragments can be 
routinely obtained as described herein and their structure (sequence) determined by 
routine means including the methods as used herein. 

P/CAF protein-encoding nucleic acids can be isolated from an organism in which
they are normally found (e.g., humans), using any of the routine techniques. For
example, a genomic DNA or cDNA library can be constructed and screened for the
presence of the nucleic acid of interest using one of the present P/CAF protein-encoding
nucleic acids as a probe. Methods of constructing and screening such libraries are well
known in the art and kits for performing the construction and screening steps are
commercially available (for example, Stratagene Cloning Systems, La Jolla, CA). Once
isolated, the nucleic acid can be directly cloned into an appropriate vector, or if
necessary, be modified to facilitate the subsequent cloning steps. Such modification
steps are routine, an example of which is the addition of oligonucleotide linkers, which
contain restriction sites, to the termini of the nucleic acid (see, for example, ref. 39).

P/CAF protein-encoding nucleic acids can also be synthesized. For example, a
method of obtaining a DNA molecule encoding a specific P/CAF protein is to synthesize
a recombinant DNA molecule which encodes the P/CAF protein. Nucleic
acid synthesis procedures are routine in the art and oligonucleotides coding for a
particular protein region are readily obtainable through automated DNA synthesis. A
nucleic acid for one strand of a double-stranded molecule can be synthesized and
hybridized to its complementary strand. One can design these oligonucleotides such that
the resulting double-stranded molecule has either internal restriction sites or appropriate
5' or 3' overhangs at the termini for cloning into an appropriate vector.
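
The design of a complementary strand with cloning overhangs can be sketched as below. The core sequence is a made-up placeholder, the function names are hypothetical, and the AATT overhang is chosen only as a familiar example of a 5' overhang (the one left by EcoRI digestion); none of these specifics come from the specification.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the complementary strand, written 5' to 3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]


def with_five_prime_overhangs(core, overhang="AATT"):
    """Design two oligonucleotides that anneal over `core` and leave a
    single-stranded 5' `overhang` at each end of the duplex.

    Returns (top, bottom), both written 5' to 3'.
    """
    top = overhang + core.upper()
    bottom = overhang + reverse_complement(core)
    return top, bottom


top, bottom = with_five_prime_overhangs("ATGGCC")  # placeholder core
print(top)
print(bottom)
```

Because only the core regions are complementary, annealing the two oligonucleotides leaves the overhang bases unpaired at both termini, ready for ligation into a vector cut to give matching ends.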

Oligonucleotides complementary to or identical with the P/CAF protein- 
encoding nucleic acid sequence can be synthesized as primers for amplification 
reactions, such as PCR, or as probes to detect P/CAF protein encoding nucleic acids by 
various hybridization protocols (e.g., Northern blot, Southern blot, dot blot, colony
screening, etc.). 

Double-stranded molecules coding for relatively large proteins can readily be
synthesized by first constructing several different double-stranded molecules that code
for particular regions of the protein, followed by ligating these DNA molecules together.
For example, Cunningham, et al. (40), have constructed a synthetic gene encoding the
human growth hormone by first constructing overlapping and complementary synthetic
oligonucleotides and ligating these fragments together. See also, Ferretti, et al. (41),
wherein synthesis of a 1057 base pair synthetic bovine rhodopsin gene from synthetic
oligonucleotides is disclosed.
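
The idea of building a longer molecule from overlapping pieces can be illustrated with a simplified sketch. It is an assumption-laden toy: the fragments are made-up sequences, all written on the same strand and already in order, whereas the cited syntheses use complementary oligonucleotides on alternating strands. The function names and the minimum overlap of four bases are hypothetical.

```python
def overlap_length(left, right, min_overlap=4):
    """Length of the longest suffix of `left` that equals a prefix of
    `right`, or 0 if it is shorter than `min_overlap`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(ordered_fragments, min_overlap=4):
    """Join ordered, same-strand fragments through their shared ends."""
    assembled = ordered_fragments[0]
    for fragment in ordered_fragments[1:]:
        size = overlap_length(assembled, fragment, min_overlap)
        if size == 0:
            raise ValueError(f"no overlap found for fragment {fragment!r}")
        assembled += fragment[size:]  # append only the new bases
    return assembled


# Toy fragments (hypothetical sequences, not from any real gene).
print(assemble(["ATGGCCAAG", "CAAGTTTGA", "TTGACCC"]))
```

Requiring a minimum overlap guards against joining fragments through a chance match of one or two bases.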

By constructing a P/CAF protein-encoding nucleic acid in this manner, one
skilled in the art can readily obtain any particular P/CAF protein with modifications at
any particular position or positions. See also, U.S. Patent No. 5,503,995 which
describes an enzyme template reaction method of making synthetic genes. Techniques
such as this are routine in the art and are well documented. DNA encoding the P/CAF
protein or P/CAF protein fragments can then be expressed in vivo or in vitro.

The nucleic acid encoding the P/CAF protein can be any nucleic acid that
functionally encodes the P/CAF protein. To functionally encode the protein (i.e., allow
the nucleic acid to be expressed), the nucleic acid can include, but is not limited to,
expression control sequences, such as an origin of replication, a promoter, regions
upstream or downstream of the promoter, such as enhancers that may regulate the
transcriptional activity of the promoter, appropriate restriction sites to facilitate cloning
of inserts adjacent to the promoter, antibiotic resistance genes or other markers which
can serve to select for cells containing the vector or the vector containing the insert, and
necessary information processing sites, such as ribosome binding sites, RNA splice sites,
polyadenylation sites and transcription termination sequences as well as any other
sequence which may facilitate the expression of the inserted nucleic acid.

20 Preferred expression control sequences are promoters derived from 

metallothionine genes, actin genes, immunoglobulin genes, CMV, S V40, adenovirus, 
bovine papilloma virus, etc. A nucleic acid encoding a P/CAF protein can readily be 
determined based upon the genetic code for the amino acid sequence of the P/CAF 
protein and many nucleic acid sequences will encode a P/CAF protein. Modifications in 

25 the nucleic acid sequence encoding the P/CAF protein are also contemplated. 
Modifications that can be useful are modifications to the sequences controlling 
expression of the P/CAF protein to make production of P/CAF protein inducible or 
repressible as controlled by the appropriate inducer or repressor. Such means are
standard in the art (see, e.g., ref. 39). The nucleic acids can be generated by means
standard in the art, such as by recombinant nucleic acid techniques, as exemplified in the
examples herein, and by synthetic nucleic acid synthesis or in vitro enzymatic synthesis.
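
The statement above that many nucleic acid sequences will encode the same protein follows from the degeneracy of the genetic code. The following Python sketch illustrates the point under stated assumptions: it uses the standard genetic code, and the three-residue peptide "MKW" is a hypothetical example, not a sequence taken from this specification. The function names are likewise illustrative only.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse lookup: amino acid -> list of codons that encode it.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate a DNA coding sequence (reading frame 1) into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide.upper())


if __name__ == "__main__":
    peptide = "MKW"  # hypothetical peptide, for illustration only
    print(count_encodings(peptide))
    print(translate("ATGAAATGG"), translate("ATGAAGTGG"))
```

Because the count is the product of the number of synonymous codons at each position, it grows rapidly with protein length, which is why no single nucleotide sequence is required to encode a given protein.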

After a nucleic acid encoding a particular P/CAF protein of interest, or a region
of that nucleic acid, is constructed, modified, or isolated, that nucleic acid can then be 
cloned into an appropriate vector, which can direct the in vivo or in vitro synthesis of 
that wild-type and/or modified P/CAF protein. The vector is contemplated to have the 
necessary functional elements that direct and regulate transcription of the inserted
nucleic acid, as described above. The vector containing the P/CAF nucleic acid or 
nucleic acid fragment can be in a host (e.g., cell or transgenic animal) for expressing the 
nucleic acid. The P/CAF protein or fragment thereof can thus be produced in a host 
system containing the expression vector and its functional activity as described herein 
can be demonstrated according to methods well known in the art.

There are numerous E. coli (Escherichia coli) expression vectors known to one
of ordinary skill in the art useful for the expression of proteins. Other microbial hosts 
suitable for use include bacilli, such as Bacillus subtilis, and other enterobacteria, such
as Salmonella, Serratia, as well as various Pseudomonas species. These prokaryotic
hosts can support expression vectors which will typically contain expression control 
sequences compatible with the host cell (e.g., an origin of replication). In addition, any 
number of a variety of well-known promoters will be present, such as the lactose 
promoter system, a tryptophan (Trp) promoter system, a beta-lactamase promoter
system, or a promoter system from phage lambda. The promoters will typically control
expression, optionally with an operator sequence and have ribosome binding site 
sequences, for example, for initiating and completing transcription and translation. If 
necessary, an amino terminal methionine can be provided by insertion of a Met codon 5' 
and in-frame with the gene sequence. Also, the carboxy-terminal extension of the
protein can be removed using standard oligonucleotide mutagenesis procedures.

Additionally, yeast expression can be used. There are several advantages to 
yeast expression systems. First, evidence exists that proteins produced in yeast secretion 
systems exhibit correct disulfide pairing. Second, post-translational glycosylation is 
efficiently carried out by yeast secretory systems. The Saccharomyces cerevisiae
pre-pro-alpha-factor leader region (encoded by the MFα-1 gene) is routinely used to direct
protein secretion from yeast (42). The leader region of pre-pro-alpha-factor contains a
signal peptide and a pro-segment which includes a recognition sequence for a yeast 
protease encoded by the KEX2 gene. This enzyme cleaves the precursor protein on the 
carboxyl side of a Lys-Arg dipeptide cleavage-signal sequence. The polypeptide coding
sequence can be fused in-frame to the pre-pro-alpha-factor leader region. This construct
is then put under the control of a strong transcription promoter, such as the alcohol 
dehydrogenase I promoter or a glycolytic promoter. The protein coding sequence is 
followed by a translation termination codon which is followed by transcription 
termination signals. Alternatively, the polypeptide encoding sequence of interest can be
fused to a second protein coding sequence, such as Sj26 or β-galactosidase, used to
facilitate purification of the resultant fusion protein by affinity chromatography. The
insertion of protease cleavage sites to separate the components of the fusion protein is 
applicable to constructs used for expression in yeast. 

Efficient post-translational glycosylation and expression of recombinant proteins
can also be achieved in Baculovirus expression systems in insect cells.

Mammalian cells permit the expression of proteins in an environment that favors 
important post-translational modifications such as folding and cysteine pairing, addition 
of complex carbohydrate structures and secretion of active protein. Vectors useful for
the expression of proteins in mammalian cells are characterized by insertion of the 
protein encoding sequence between a strong viral promoter and a polyadenylation 
signal. The vectors can contain genes conferring either gentamicin or methotrexate 
resistance for use as selectable markers. For example, the antigen and immunoreactive 
fragment coding sequence can be introduced into a Chinese hamster ovary (CHO) cell 
line using a methotrexate resistance-encoding vector. Presence of the vector RNA in 
transformed cells can be confirmed by Northern blot analysis and production of a cDNA 
or opposite strand RNA corresponding to the protein encoding sequence can be 
confirmed by Southern and Northern blot analysis, respectively. A number of other 
suitable host cell lines capable of secreting intact proteins have been developed in the art 
and include the CHO cell lines, HeLa cells, myeloma cell lines, Jurkat cells, and the like.
Expression vectors for these cells can include expression control sequences, as described
above. The vectors containing the nucleic acid sequences of interest can be transferred 
into the host cell by well-known methods, which vary depending on the type of cell host. 
For example, calcium chloride transfection is commonly utilized for prokaryotic cells, 
whereas calcium phosphate treatment or electroporation may be used for other cell
hosts. 

Alternative vectors for the expression of protein in mammalian cells, similar to 
those developed for the expression of human gamma-interferon, tissue plasminogen 
activator, clotting Factor VIII, hepatitis B virus surface antigen, protease Nexin 1, and
eosinophil major basic protein, can be employed. Further, the vector can include CMV 
promoter sequences and a polyadenylation signal available for expression of inserted 
nucleic acid in mammalian cells (such as COS7). 

The nucleic acid sequences can be expressed in hosts after the sequences have
been positioned to ensure the functioning of an expression control sequence. These
expression vectors are typically replicable in the host organisms either as episomes or as 
an integral part of the host chromosomal DNA. Commonly, expression vectors can 
contain selection markers, e.g., tetracycline resistance or hygromycin resistance, to
permit detection and/or selection of those cells transformed with the desired nucleic acid
sequences (see, e.g., U.S. Patent 4,704,362). 

The nucleic acids produced as described above can also be expressed in a host 
which is a non-human animal to create a transgenic animal, containing, in a germ or
somatic cell, a nucleic acid comprising the coding sequence for all or a portion of the
P/CAF protein, as well as all of the other regulatory elements required for expression of 
the P/CAF protein-encoding sequence. The animal will express the P/CAF gene or 
portion thereof to produce the P/CAF protein or protein fragment and such expression 
can be detected by determination of a particular phenotype unique to the transgenic
animal expressing the transferred nucleic acid.



WO 98/03652 PCT/US97/12877 

The nucleic acid can be the nucleic acid of SEQ ID NO: 10, a nucleic acid having
a nucleotide sequence which encodes the P/CAF protein, a nucleic acid having a
nucleotide sequence which encodes the protein of SEQ ID NO: 1, as well as the nucleic
acids that encode the proteins comprising the fragments of SEQ ID NOS:2 and 4.

The nucleic acids of the invention can contain substitutions or deletions which 
provide a particular phenotype of interest. For example, various deletions or base 
substitutions can be introduced into the nucleic acid encoding the P/CAF protein for the 
purpose of studying the effects of these particular deletions or substitutions on the
transcription modulation activity of the P/CAF protein. These effects can be monitored
by observation of such characteristics as growth and development of the animal, the 
ability to develop tumors, survival rates and the like. The gene construct introduced 
into the animal cells to produce the transgenic animal can contain any of the regulatory 
elements described above to modulate expression of the foreign genes. As used herein,
the term "phenotype" includes morphology, biochemical profiles, changes in tumor
formation and other parameters that are affected by the presence of the P/CAF protein.

The transgenic animals of the invention can also be used in a method for 
determining the effectiveness of administering a nucleic acid encoding a functional
P/CAF protein to a subject in need of a functional P/CAF protein. First, a nucleic acid
encoding a nonfunctional P/CAF protein can be introduced into the animal's cells and
expressed to yield a characteristic phenotype. Then, using standard gene therapy
techniques, a nucleic acid encoding a functional P/CAF protein can be introduced into 
the animal's cells and the effects on the animal's phenotypic characteristics can be
determined.

Having provided and taught how to obtain a nucleic acid that encodes a P/CAF 
protein, an isolated nucleic acid that encodes a fragment of P/CAF protein is also 
provided. The nucleic acid encoding the fragment can be obtained using any of the 
methods applicable to the nucleic acid encoding the entire P/CAF protein. The nucleic
acid fragment can encode a species-specific P/CAF protein fragment (e.g., found in the
P/CAF protein of humans, but not in the P/CAF proteins of other species). Nucleic
acids encoding species-specific fragments of P/CAF protein are themselves species- 
specific or allele-specific fragments of the P/CAF gene. 



Examples of fragments of a nucleic acid encoding a fragment of the P/CAF
protein can include the nucleic acid sequences which encode the amino acid sequences
of the fragments of SEQ ID NOS:2 or 4. The same routine computer analyses used to 
select these examples of fragments can be routinely used to obtain others. Fragments of
P/CAF-encoding nucleic acids can be primers for PCR or probes, which can be
species-specific, gene-specific or allele-specific. P/CAF-encoding nucleic acid fragments can
encode antigenic or immunogenic fragments of P/CAF protein that can be used in 
therapeutic assays or screening protocols. P/CAF gene fragments can encode fragments 
of P/CAF protein having histone acetylase activity and/or p300/CBP binding activity as 
described above, as well as other uses that may become apparent. 

An isolated nucleic acid of at least ten nucleotides that selectively hybridizes with
the nucleic acid of SEQ ID NO: 10 under selected conditions is provided. For example,
the conditions can be PCR amplification conditions and the hybridizing nucleic acid can 
be a primer consisting of a specific fragment of the reference sequence or a nearly 
identical nucleic acid that hybridizes only to the exemplified P/CAF-encoding nucleic
acid or allelic variants thereof. 

The invention provides an isolated nucleic acid that selectively hybridizes with 
the P/CAF-encoding nucleic acid sequence of SEQ ID NO: 10 under stringent
conditions. The hybridizing nucleic acid can be a probe that hybridizes only to the
exemplified P/CAF-encoding nucleic acid sequence. Thus, the hybridizing nucleic acid
can be a naturally occurring species-specific allelic variant of the exemplified P/CAF 
gene. The hybridizing nucleic acid can also include insubstantial base substitutions that 
do not prevent hybridization under the stated stringent conditions or affect either the
function of the encoded protein, the way the protein accomplishes that function (e.g., its
secondary structure) or the ultimate result of the protein's activity. The means for
determining these parameters are well known. 

As used herein to describe nucleic acids, the term "selectively hybridizes" 
excludes the occasional randomly hybridizing nucleic acids as well as nucleic acids that
encode other known homologs of the P/CAF protein. The selectively hybridizing 
nucleic acids of the invention can have at least 70%, 73%, 78%, 80%, 85%, 88%, 90%, 
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% complementarity with the 
segment and strand of the sequence to which it hybridizes. This list is not intended to 
exclude percent complementarity values between those listed. The nucleic acids can be
at least 10, 15, 16, 17, 18, 20, 21, 23, 24, 25, 30, 35, 40, 50, 100, 150, 200, 300, 500, 
550, 750, 900, 950, or 1000 nucleotides in length or any intervening length, depending 
on whether the nucleic acid is to be used as a primer, probe or for protein expression. 
The hybridizing nucleic acid can comprise a region of at least ten nucleotides (up to full 
length) that is completely complementary to a unique region of the nucleic acid to which
it hybridizes. 
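
As a rough illustration of the percent complementarity values discussed above, the following Python sketch scores a probe against an equal-length target segment by pairing the probe, base by base, with the target read in the antiparallel direction. It is a minimal sketch under stated assumptions: both sequences are DNA written 5' to 3', only Watson-Crick pairs are counted, gaps are not handled, and the sequences and the 70% default threshold in `meets_threshold` are illustrative (70% being the lowest value listed above).

```python
# Watson-Crick partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(probe, target):
    """Percent of probe bases that pair with the antiparallel target segment.

    Both arguments are DNA strings written 5'->3' and must be the same length.
    """
    probe, target = probe.upper(), target.upper()
    if not probe or len(probe) != len(target):
        raise ValueError("probe and target must be non-empty and of equal length")
    # Probe position i pairs with the target read from its 3' end.
    paired = sum(
        1 for p, t in zip(probe, reversed(target)) if COMPLEMENT.get(p) == t
    )
    return 100.0 * paired / len(probe)


def meets_threshold(probe, target, minimum=70.0):
    """True if the probe reaches the chosen minimum percent complementarity."""
    return percent_complementarity(probe, target) >= minimum


if __name__ == "__main__":
    # Hypothetical 4-mers, for illustration only.
    print(percent_complementarity("ATGC", "GCAT"))  # fully complementary pair
    print(percent_complementarity("ATGA", "GCAT"))  # one mismatched position
```

A score of this kind is only a first filter; whether a given probe actually hybridizes selectively still depends on its length and on the wash conditions used.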

The nucleic acid can be an alternative coding sequence for the P/CAF protein, or 
can be used as a probe or primer for detecting the presence of or obtaining the P/CAF 
protein. If used as primers, the invention provides compositions including at least two
nucleic acids which selectively hybridize with different regions of the nucleic acid so as 
to amplify a desired region. Depending on the length of the probe or primer, it can 
range between 70% complementary bases and full complementarity and still hybridize 
under stringent conditions. 

For example, for the purpose of obtaining or determining the presence of a 
nucleic acid encoding the P/CAF protein, the degree of complementarity between the 
hybridizing nucleic acid (probe or primer) and the sequence to which it hybridizes 
(P/CAF DNA in a sample) should be at least enough to exclude hybridization with a
nucleic acid from another species. The invention provides examples of these nucleic
acids of P/CAF, so that the degree of complementarity required to distinguish selectively
hybridizing from nonselectively hybridizing nucleic acids under stringent conditions can
be clearly determined for each nucleic acid. It should also be clear that the hybridizing 
nucleic acids of the invention will not hybridize with nucleic acids encoding unrelated 
proteins (hybridization is selective) under stringent conditions. 

"Stringent conditions" refers to the washing conditions used in a hybridization 
protocol. In general, the washing conditions should be a combination of temperature 
and salt concentration chosen so that the denaturation temperature is approximately
5-20°C below the calculated Tm of the nucleic acid hybrid under study. The temperature
and salt conditions are readily determined empirically in preliminary experiments in
which samples of reference DNA immobilized on filters are hybridized to the probe or 
protein encoding nucleic acid of interest and then washed under conditions of different 
stringencies. For example, the nucleic acid sequence of SEQ ID NO: 10 was used as a 
specific radiolabeled probe for the detection of messenger RNA transcribed from the
P/CAF gene by performing hybridizations under stringent conditions. The Tm of such an
oligonucleotide can be estimated by allowing 2°C for each A or T nucleotide, and 4°C
for each G or C. For example, an 18 nucleotide probe of 50% G+C would, therefore,
have an approximate Tm of 54°C.
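
The Tm estimate described above (2°C per A or T, 4°C per G or C, often called the Wallace rule) and the 5-20°C offset used for stringent washes can be worked through in a short Python sketch. The 18-mer below is a hypothetical sequence chosen only because it has nine A/T and nine G/C bases; the function names are illustrative, and the rule is an approximation intended for short oligonucleotides.

```python
def wallace_tm(oligo):
    """Estimate Tm in degrees C as 2 per A/T plus 4 per G/C."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence may contain only A, C, G and T")
    return 2 * at + 4 * gc


def wash_window(tm, low_offset=5, high_offset=20):
    """Temperature range lying 5-20 degrees C below the estimated Tm."""
    return (tm - high_offset, tm - low_offset)


if __name__ == "__main__":
    probe = "ATGCATGCATGCATGCAG"  # hypothetical 18-mer, 50% G+C
    tm = wallace_tm(probe)        # 9 * 2 + 9 * 4 = 54
    print(tm, wash_window(tm))
```

For this probe the arithmetic is 9 × 2°C + 9 × 4°C = 54°C, matching the example in the text, and the corresponding wash window is 34-49°C.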

The invention provides an isolated nucleic acid that selectively hybridizes with
the P/CAF gene shown in the sequence set forth as SEQ ID NO: 10 under stringent
conditions. The invention further provides an isolated nucleic acid complementary to 
the nucleotide sequence set forth in SEQ ID NO: 10. 

Antibodies to the P/CAF protein

A purified antibody and an antiserum containing polyclonal antibodies that 
specifically bind the P/CAF protein or antigenic fragment are also provided. The term 
"bind" means the well understood antigen/antibody binding as well as other nonrandom 
association with an antigen. "Specifically bind" as used herein describes an antibody or
other ligand that does not cross react substantially with any antigen other than the one
specified, in this case, an antigen of the P/CAF protein. Antibodies can be made as
described in Harlow and Lane (33). Briefly, purified P/CAF protein or an antigenic
fragment thereof can be injected into an animal in an amount and in intervals sufficient to 
elicit a humoral immune response. Serum polyclonal antibodies can be purified directly, 
or spleen cells from the animal can be fused with an immortal cell line and screened for 
monoclonal antibody secretion, according to procedures well known in the art. Purified
monospecific polyclonal antibodies that specifically bind the P/CAF antigen are also
within the scope of the present invention. The antibodies of the present invention can
bind the protein of claim 1, the protein of claim 2, the protein of claim 3 and/or the 
protein of claim 4, as well as any other proteins of the present invention. 

A ligand that specifically binds the antigen is also contemplated. The ligand can
be a fragment of an antibody, such as, for example, an Fab fragment which retains
P/CAF binding activity, or a smaller molecule designed to bind an epitope of the P/CAF 
antigen. The antibody or ligand can be bound to a substrate or labeled with a detectable 
moiety or both bound and labeled. The detectable moieties contemplated within the
compositions of the present invention include those listed above in the description of the
diagnostic methods, including fluorescent, enzymatic and radioactive markers. 

The antibody can be bound to a solid support substrate or conjugated with a 
detectable moiety or therapeutic compound or both bound and conjugated. Such
conjugation techniques are well known in the art. For example, conjugation of 
fluorescent, radioactive or enzymatic moieties can be performed as described in the art 
(33,43). The detectable moieties contemplated in the present invention can include 
fluorescent, radioactive and enzymatic markers and the like. Therapeutic drugs
contemplated with the present invention can include cytotoxic moieties such as ricin A
chain, diphtheria toxin, Pseudomonas exotoxin and other chemotherapeutic compounds.

It is well understood by one of skill in the art that all of the above discussion
regarding antibodies to P/CAF can also be applied with regard to production,
characterization and use of antibodies which bind the p300/CBP protein or any of the
DNA-binding transcription factors of this invention.

Measuring the P/CAF protein in a sample

The present invention also provides a method for determining the presence and 
thus the amount of P/CAF protein in a biological sample. As used herein, a biological 
sample includes any tissue or cell which would contain the P/CAF protein. Examples of 
cells include tissues taken from surgical biopsies, isolated from a body fluid or prepared
in an in vitro tissue culture environment. 

One example of determining the amount of P/CAF in a biological sample can 
comprise contacting the biological sample with a polypeptide comprising the amino acid
sequence of SEQ ID NO:3 under conditions whereby a P/CAF/p300 complex can be
formed; and determining the amount of the P/CAF/p300 complex, the amount of the
complex indicating the amount of P/CAF in the sample. Determination of the amount
of P/CAF/p300 complex can be accomplished through techniques standard in the art.
For example, the complex may be precipitated out of a solution and detected by the
addition of a detectable moiety conjugated to the p300 protein or by the detection of an
antibody which binds p300 or the P/CAF protein, as taught in the Examples herein. 
Antibodies which bind p300 or the P/CAF protein can be either monoclonal or 
polyclonal antibodies and can be obtained as described herein. Detection of 
P/CAF/p300 complexes by the detection of the binding of antibodies reactive with p300
or the P/CAF protein can be accomplished using various immunoassays as are available
in the art, as described below. 

Alternatively, determination of the amount of P/CAF in a biological sample can 
comprise contacting the biological sample with a polypeptide comprising the amino acid
sequence of SEQ ID NO:9 under conditions whereby a P/CAF/CBP complex can be
formed; and determining the amount of the P/CAF/CBP complex, the amount of the 
complex indicating the amount of P/CAF in the sample. Determination of the amount 
of P/CAF/CBP complex can be accomplished through techniques standard in the art. 
For example, the complex may be precipitated out of a solution and detected by the
addition of a detectable moiety conjugated to the CBP protein or by the detection of an
antibody which binds either CBP or the P/CAF protein, as taught in the Examples
herein. Antibodies which bind CBP or the P/CAF protein can be either monoclonal or
polyclonal antibodies and can be obtained as described herein. Detection of P/CAF/CBP
complexes by the detection of the binding of antibodies reactive with CBP or the P/CAF 
protein can be accomplished using various immunoassays as are available in the art, as 
described below.

Another example of determining the amount of P/CAF in a biological sample 
comprises contacting the biological sample with an antibody which specifically binds 
P/CAF under conditions whereby a P/CAF/antibody complex can be formed and
determining the amount of the P/CAF/antibody complex, the amount of the complex
indicating the amount of P/CAF in the sample. Antibodies which bind P/CAF can be
either monoclonal or polyclonal antibodies and can be obtained as described herein.
Determination of P/CAF/antibody complexes can be accomplished using various 
immunoassays as are available in the art, as described below. 

Immunoassays such as immunofluorescence assays, radioimmunoassays (RIA), 
immunoblotting and enzyme linked immunosorbent assays (ELISA) can be readily 
adapted for detection and measurement of P/CAF in a biological sample. Both 
polyclonal and monoclonal antibodies can be used in the assays. Available
immunoassays are well known in the art and are extensively described in the patent
and scientific literature. See, for example, U.S. Patent Nos. 3,791,932; 3,839,153;
3,850,752; 3,850,578; 3,853,987; 3,867,517; 3,879,262; 3,901,654; 3,935,074;
3,984,533; 3,996,345; 4,034,074; and 4,098,876.

Screening assays for P/CAF

The present invention also provides a bioassay for screening substances for the 
ability to inhibit the histone acetyltransferase activity of P/CAF comprising contacting a
system, in which histone acetylation by P/CAF can be determined, with the substance 
under conditions whereby histone acetylation by P/CAF can occur; determining the
amount of histone acetylation by P/CAF in the presence of the substance; and comparing
the amount of histone acetylation by P/CAF in the presence of the substance with the
amount of histone acetylation by P/CAF in the absence of the substance, a decreased 
amount of histone acetylation by P/CAF in the presence of the substance indicating a 
substance that can inhibit the histone acetyltransferase activity of P/CAF. The 
acetylation of histones by P/CAF can be determined in a system including, for example, 
either core histones (histones H2A, H2B, H3 and H4) or the nucleosome core particles
(146 base pairs of DNA wrapped around the octamer of core histones) as substrates, the
P/CAF protein and radiolabeled acetyl-CoA (e.g., [1-14C]acetyl CoA). The presence of
acetylated histones can be detected by autoradiography after separation by SDS-PAGE
as described herein in the Examples. Thus, the compound to be tested for the ability to
inhibit the histone acetyltransferase activity of P/CAF can be added to this system and
assayed for inhibiting ability. 
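
The comparison step of such a bioassay (acetylation measured in the presence of the substance versus in its absence) can be sketched in Python as follows. This is an illustrative sketch only: the signal values are hypothetical quantified autoradiography readings, and the 20% tolerance band used to call an effect is an assumption introduced here, not a threshold taught in this specification.

```python
def relative_activity(signal_with, signal_without):
    """Ratio of acetylation signal with the substance to signal without it."""
    if signal_without <= 0:
        raise ValueError("control signal must be positive")
    return signal_with / signal_without


def classify(signal_with, signal_without, tolerance=0.2):
    """Label a substance by how its signal compares with the untreated control.

    `tolerance` is a hypothetical band around 1.0 inside which no effect is called.
    """
    ratio = relative_activity(signal_with, signal_without)
    if ratio < 1.0 - tolerance:
        return "inhibitor candidate"
    if ratio > 1.0 + tolerance:
        return "stimulator candidate"
    return "no clear effect"


if __name__ == "__main__":
    # Hypothetical band intensities (arbitrary units).
    control = 1000.0
    for name, treated in [("substance A", 400.0), ("substance B", 1500.0)]:
        print(name, round(relative_activity(treated, control), 2),
              classify(treated, control))
```

In practice the band around 1.0 would be set from replicate controls, so that ordinary assay-to-assay variation is not mistaken for inhibition or stimulation.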

The present invention also provides a bioassay for screening substances for the 
ability to inhibit the transcription modulating activity of P/CAF, comprising contacting a
system, in which histone acetylation by P/CAF can be determined, with the substance
under conditions whereby histone acetylation by P/CAF can occur; determining the 
amount of histone acetylation by P/CAF in the presence of the substance; and comparing 
the amount of histone acetylation by P/CAF in the presence of the substance with the 
amount of histone acetylation by P/CAF in the absence of the substance, a decreased
amount of histone acetylation by P/CAF in the presence of the substance indicating a
substance that can inhibit the transcription modulating activity and cell cycle progression
suppressing activity of P/CAF. The acetylation of histones by P/CAF can be determined 
in a system including, for example, either core histones (histones H2A, H2B, H3 and
H4) or the nucleosome core particles (146 base pairs of DNA wrapped around the
octamer of core histones) as substrates, the P/CAF protein and radiolabeled acetyl-CoA
(e.g., [1-14C]acetyl CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples. 
Thus, the compound to be tested for the ability to inhibit the transcription modulating 
activity of P/CAF by interfering with the histone acetyltransferase activity of P/CAF can
be added to this system and assayed for inhibiting ability.

Also provided in the present invention is a bioassay for screening substances for
the ability to inhibit the binding of p300 to P/CAF, comprising contacting a system in 
which the binding of p300 to P/CAF can be determined, with the substance under 
conditions whereby the binding of p300 and P/CAF can occur; determining the amount 
of p300 binding to P/CAF in the presence of the substance; and comparing the amount
of p300 binding to P/CAF in the presence of the substance with the amount of p300 
binding to P/CAF in the absence of the substance, a decreased amount of p300 binding 
to P/CAF in the presence of the substance indicating a substance that can inhibit the 
binding of p300 to P/CAF. The binding of p300 to P/CAF can be determined in a 
system, for example, which can include a cell free reaction mixture comprising a
fragment of the p300 protein comprising the amino acid sequence of SEQ ID NO:3 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells
producing both p300 and P/CAF. Determination of the binding of p300 to P/CAF can
be carried out as taught herein. 

Additionally provided in the present invention is a bioassay for screening 
substances for the ability to inhibit the binding of CBP to P/CAF, comprising contacting 
a system in which the binding of CBP to P/CAF can be determined, with the substance 
under conditions whereby the binding of CBP to P/CAF can occur; determining the
amount of CBP binding to P/CAF in the presence of the substance; and comparing the
amount of CBP binding to P/CAF in the presence of the substance with the amount of 
CBP binding to P/CAF in the absence of the substance, a decreased amount of CBP 
binding to P/CAF in the presence of the substance indicating a substance that can inhibit 
the binding of CBP to P/CAF. The binding of CBP to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the CBP protein comprising the amino acid sequence of SEQ ID NO:9 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells 
producing both CBP and P/CAF. Determination of the binding of CBP to P/CAF can be 
carried out as taught herein. 

The present invention further contemplates a bioassay for screening substances
for the ability to stimulate the histone acetyltransferase activity of P/CAF comprising 
contacting a system, in which histone acetylation by P/CAF can be determined, with the 
substance; determining the amount of histone acetylation by P/CAF in the presence of 
the substance; and comparing the amount of histone acetylation by P/CAF in the
presence of the substance with the amount of histone acetylation by P/CAF in the 
absence of the substance, an increased amount of histone acetylation by P/CAF in the 
presence of the substance indicating a substance that can stimulate the histone 
acetyltransferase activity of P/CAF. The acetylation of histones by P/CAF can be
determined in a system including, for example, either core histones (histones H2A, H2B,
H3 and H4) or the nucleosome core particles (146 base pairs of DNA wrapped around
the octamer of core histones) as substrates, the P/CAF protein and radiolabeled
acetyl-CoA (e.g., [1-14C]acetyl CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to stimulate the histone acetyltransferase
activity of P/CAF can be added to this system and assayed for stimulating ability.

The present invention further contemplates a bioassay for screening substances 
for the ability to stimulate the transcription modulating activity of P/CAF comprising
contacting a system, in which histone acetylation by P/CAF can be determined, with the
substance; determining the amount of histone acetylation by P/CAF in the presence of 
the substance; and comparing the amount of histone acetylation by P/CAF in the 
presence of the substance with the amount of histone acetylation by P/CAF in the 
absence of the substance, an increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can stimulate the transcription
modulating activity of P/CAF. The acetylation of histones by P/CAF can be determined
in a system including, for example, either core histones (histones H2A, H2B, H3 and 
H4) or the nucleosome core particles (146 base pairs of DNA wrapped around the 
octamer of core histones) as substrates, the P/CAF protein and radiolabeled acetyl-CoA
(e.g., [1-14C]acetyl CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to stimulate the transcription modulating
activity of P/CAF by increasing the histone acetyltransferase activity of P/CAF can be
added to this system and assayed for stimulating ability. 

The present invention further provides a bioassay for screening substances for
the ability to stimulate binding of p300 to P/CAF, comprising contacting a system in
which the binding of p300 to P/CAF can be determined, with the substance under 
conditions whereby the binding of p300 to P/CAF can occur; determining the amount of 
p300 binding to P/CAF in the presence of the substance; and comparing the amount of
p300 binding to P/CAF in the presence of the substance with the amount of p300
binding to P/CAF in the absence of the substance, an increased amount of p300 binding
to P/CAF in the presence of the substance indicating a substance that can stimulate the 
binding of p300 to P/CAF. The binding of p300 to P/CAF can be determined in a 
system, for example, which can include a cell free reaction mixture comprising a 

15 fragment of the p300 protein comprising the amino acid sequence of SEQ ID NO:3 and 
P/CAF. Alternatively, the system can comprise a cell extract produced from cells 
producing both p300 and P/CAF. Determination of the binding of p300 to P/CAF can 
be carried out as taught herein. 

Additionally provided in the present invention is a bioassay for screening
substances for the ability to stimulate the binding of CBP to P/CAF, comprising
contacting a system in which the binding of CBP to P/CAF can be determined, with the
substance under conditions whereby the binding of CBP to P/CAF can occur;
determining the amount of CBP binding to P/CAF in the presence of the substance; and
comparing the amount of CBP binding to P/CAF in the presence of the substance with
the amount of CBP binding to P/CAF in the absence of the substance, an increased
amount of CBP binding to P/CAF in the presence of the substance indicating a
substance that can stimulate the binding of CBP to P/CAF. The binding of CBP to
P/CAF can be determined in a system, for example, which can include a cell free
reaction mixture comprising a fragment of the CBP protein comprising the amino acid
sequence of SEQ ID NO:9 and P/CAF. Alternatively, the system can comprise a cell
extract produced from cells producing both CBP and P/CAF. Determination of the
binding of CBP to P/CAF can be carried out as taught herein.

Transcription modulating activity of P/CAF 

The present invention contemplates a method for inhibiting the transcription
modulating activity of P/CAF in a subject, comprising administering to the subject a
transcription modulating activity inhibiting amount of a substance in a pharmaceutically
acceptable carrier. For example, the substance can be identified according to the
protocols provided herein as one that can inhibit the transcription modulating activity of
P/CAF by preventing the binding of P/CAF to p300/CBP or by inhibiting the histone
acetyltransferase activity of P/CAF as well as by any other inhibitory mechanism as
identified by the protocols provided herein. Inhibition of the transcription modulating
activity of P/CAF in a subject is desirable, for example, to inhibit HIV TAT-mediated
transcription and therefore, the method of the present invention can be used to treat
HIV-infected subjects.

The substance can be in a pharmaceutically acceptable carrier. By
"pharmaceutically acceptable" is meant a material that is not biologically or otherwise
undesirable, i.e., the material may be administered to a subject, along with the substance,
without causing any undesirable biological effects or interacting in a deleterious manner
with any of the other components of the pharmaceutical composition in which it is
contained. The carrier would naturally be selected to minimize any degradation of the
active ingredient and to minimize any adverse side effects in the subject.

The transcription modulating activity and/or histone acetyltransferase activity of
P/CAF can be inhibited in a subject by administering to the subject a substance which
binds p300/CBP at the P/CAF binding site or a substance which binds the P/CAF
protein at the p300/CBP binding site, the ultimate result being that P/CAF and
p300/CBP do not bind with one another and P/CAF cannot exert its transcription
modulating and/or histone acetyltransferase effect. The substance can be a protein, such
as an antibody which binds the P/CAF protein at or near the p300/CBP
binding site, thereby preventing its binding, or an antibody which binds the p300/CBP
protein at or near the P/CAF binding site, thereby preventing its binding. The substance
can also bind the histone acetyltransferase site on P/CAF or the acetylation site on the
histone, thereby preventing acetylation by P/CAF.

The substance which binds p300/CBP, the P/CAF protein or the histone and has
the net effect of inhibiting the transcription modulating effect and/or histone
acetyltransferase activity of P/CAF in the cell can be delivered to a cell in the subject by
mechanisms well known in the art.

Alternatively, a nucleic acid encoding a protein which binds either to p300/CBP
or the P/CAF protein and has the net effect of inhibiting the transcription modulating
effect and/or histone acetyltransferase activity of P/CAF in the cell can be delivered to a
cell in the subject by gene transduction mechanisms well known in the art. For example,
nucleic acid can be introduced by liposomes as well as via retroviral or adeno-associated
viral vectors, as described below.



The substance which inhibits the transcription modulating effect and/or histone
acetyltransferase activity of P/CAF can be an antisense RNA or an antisense DNA which
binds the RNA or DNA of P/CAF, thereby preventing translation or transcription of the
RNA or DNA encoding P/CAF and having the net effect of inhibiting the transcription
modulating effect and/or histone acetyltransferase activity of P/CAF by inhibiting P/CAF
production. The antisense RNA of the present invention can be generated from the
nucleic acid of SEQ ID NO:14 (human) or SEQ ID NO:15 (mouse). Furthermore, the
antisense DNA can be a phosphorothioate oligodeoxyribonucleotide having the
nucleotide sequence of SEQ ID NO:16 (human) or of SEQ ID NO:17 (mouse). The
mouse antisense RNA can be used to inhibit the activity of mouse P/CAF, having the
nucleotide sequence of SEQ ID NO:18 and the amino acid sequence of SEQ ID NO:8.
The present invention also contemplates an antisense nucleic acid sequence which can
bind the DNA or RNA of any of the transcription factors or other proteins now known
or later identified to bind P/CAF, thereby inhibiting expression of the gene products of



WO 98/03652 



PCT/US97/12877 




these proteins and having the net effect of inhibiting the transcription modulating effect 
and/or histone acetyltransferase activity of P/CAF.

The antisense nucleic acid can comprise a typical nucleic acid, but the antisense
nucleic acid can also be a modified nucleic acid or a derivative of a nucleic acid such as a
phosphorothioate analogue of a nucleic acid. The composition can comprise, for
example, an antisense RNA that specifically binds an RNA encoded by the gene
encoding the serum protein. Antisense RNAs can be synthesized and used by standard
methods (62).

Antisense RNA can inhibit gene expression by forming an RNA/RNA duplex
between the antisense RNA and the RNA transcribed from the target gene. The precise
mechanism by which this duplex formation decreases the production of the protein
encoded by the endogenous gene probably involves binding of complementary regions
of the normal sense mRNA and the antisense RNA strand with duplex formation in a
manner that blocks RNA processing and translation. Alternative mechanisms include
the formation of a triplex between the antisense RNA and duplex DNA or the formation
of a DNA-RNA duplex with subsequent degradation of DNA-RNA hybrids by RNase
H. Furthermore, an antigene effect can result from certain DNA-based oligonucleotides
via triple-helix formation between the oligomer and double-stranded DNA which results
in the repression of gene transcription. Regardless of the specific molecular mechanism,
the present invention results in inhibition of expression of the P/CAF gene by the
introduced and replicated DNA resulting in inhibition of the transcription modulating
and/or histone acetyltransferase activity of P/CAF, by a reduction in the expression of
the nucleic acid to which the antisense nucleic acid is hybridized, and therefore a
reduction of the gene product from the targeted gene.

The antisense nucleic acid may be obtained by any number of techniques known
to one skilled in the art. One method of constructing an antisense nucleic acid is to
synthesize a recombinant antisense DNA molecule. For example, oligonucleotide
synthesis procedures are routine in the art and oligonucleotides coding for a particular




protein or regulatory region are readily obtainable through automated DNA synthesis.
A nucleic acid for one strand of a double-stranded molecule can be synthesized and
hybridized to its complementary strand. One can design these oligonucleotides such that
the resulting double-stranded molecule has either internal restriction sites or appropriate
5' or 3' overhangs at the termini for cloning into an appropriate vector. Double-stranded
molecules coding for relatively large proteins or regulatory regions can be synthesized
by first constructing several different double-stranded molecules that code for particular
regions of the protein or regulatory region, followed by ligating these DNA molecules
together. Once the appropriate DNA molecule is synthesized, this DNA can be cloned
downstream of a promoter in an antisense orientation. Techniques such as this are
routine in the art and are well documented.
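
As an illustration of the sequence relationship involved, an antisense DNA oligonucleotide is the reverse complement of the sense-strand region it targets. The sketch below assumes plain A/C/G/T input; the function names and the example sequence are hypothetical and are not taken from any SEQ ID NO of this disclosure.

```python
# Watson-Crick pairing for DNA bases.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    seq = dna.upper()
    try:
        return "".join(_COMPLEMENT[base] for base in reversed(seq))
    except KeyError as exc:
        raise ValueError(f"unexpected base: {exc.args[0]!r}") from None


def antisense_oligo(sense_cdna, start, length):
    """Return the antisense oligo for sense_cdna[start:start + length]."""
    target = sense_cdna[start:start + length]
    if start < 0 or len(target) != length:
        raise ValueError("requested window falls outside the sequence")
    return reverse_complement(target)


if __name__ == "__main__":
    made_up_sense = "ATGGCCGAGTTC"  # illustrative sequence only
    print(antisense_oligo(made_up_sense, 0, 6))
```

Cloning the sense fragment in the reverse orientation behind a promoter yields a transcript with this same complementary relationship to the endogenous mRNA.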

An example of another method of obtaining an antisense nucleic acid is to isolate
that nucleic acid from the organism in which it is found and clone it in an antisense
orientation. For example, a DNA or cDNA library can be constructed and screened for
the presence of the nucleic acid of interest. Methods of constructing and screening such
libraries are well known in the art and kits for performing the construction and screening
steps are commercially available (for example, Stratagene Cloning Systems, La Jolla,
CA). Once isolated, the nucleic acid can be directly cloned into an appropriate vector in
an antisense orientation, or if necessary, be modified to facilitate the subsequent cloning
steps. Such modification steps are routine, an example of which is the addition of
oligonucleotide linkers which contain restriction sites to the termini of the nucleic acid.
General methods are set forth in Sambrook et al. (39).

The DNA that is introduced into the cell is in an expression orientation that is
antisense to a corresponding endogenous DNA or RNA of the cells. For example,
where an endogenous DNA comprises a gene which encodes a particular protein, the
introduced DNA is in an expression orientation opposite the expression of the
endogenous DNA, that is, the DNA operatively linked to a promoter is in an antisense
expression orientation relative to the corresponding endogenous gene. The introduced
DNA may be homologous to the entire transcribed gene or homologous to only part of
the transcribed gene. Alternatively, the sequence of the introduced DNA may be
divergent from that of the endogenous DNA but only divergent to the extent that
hybridization of the nucleic acids occurs, thereby preventing transcription. One skilled
in the art can determine the maximum extent of this divergence by routine screening of
antisense DNAs corresponding to an endogenous DNA of the cell. In this manner, one
skilled in the art can readily determine which fragments, or alternatively the extent of
homology of the fragments or the entire gene, are necessary to inhibit gene
expression.
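
When candidates for such screening are tabulated, the degree of divergence can be expressed as percent identity between the sense-strand equivalent of a candidate and the corresponding endogenous region. The following sketch assumes the two sequences are already aligned and of equal length; the function name and the example sequences are hypothetical, and percent identity is only a rough proxy for hybridization, which must still be established experimentally.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    endogenous = "ATGGCCGAGT"  # made-up sequence
    candidate = "ATGGCTGAGT"   # one substitution relative to the above
    print(percent_identity(endogenous, candidate))
```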

The antisense nucleic acids of the present invention can be made according to
protocols standard in the art, as well as described in the Examples provided herein. The
antisense nucleic acids can be administered to a subject according to the gene
transduction protocols standard in the art, as described below.

The present invention also contemplates a method for stimulating the
transcription modulating activity and/or histone acetyltransferase activity of P/CAF in a
subject comprising administering to the subject a substance, in a pharmaceutically
acceptable carrier, determined according to the methods taught herein, to have a
stimulatory effect on the transcription modulating and/or histone acetyltransferase
activity of P/CAF. The substance can be one which has been identified, according to the
protocols provided herein, to stimulate histone acetyltransferase activity in P/CAF or
promote binding of P/CAF to p300/CBP. The stimulation of the transcription
modulation activity and/or histone acetyltransferase activity of P/CAF in a subject is
desirable, for example, to activate tumor suppressor p53 (which promotes apoptosis) or
to activate the muscle differentiation factor, MyoD. Thus, the method of the present
invention can be employed to treat cancer and to promote muscle differentiation in
conditions where muscle differentiation is desired. The substance can be delivered to a
cell in the subject by mechanisms well known in the art.

Further contemplated in the present invention is a method for promoting binding
of P/CAF to p300/CBP in a subject, comprising administering to the subject a substance
identified by the methods provided herein to promote binding of P/CAF to either p300
or CBP.

Additionally, a nucleic acid encoding a protein which stimulates the transcription
modulating activity and/or histone acetyltransferase activity of P/CAF can be delivered
to a cell in the subject by gene transduction mechanisms, as described below.

Also provided in the present invention is a method of inhibiting the cell cycle
progression inducing effect of an oncoprotein which binds p300/CBP in a subject
comprising transducing the cells of the subject with a vector comprising a nucleic acid
encoding the P/CAF protein; and inducing expression of the nucleic acid in the cell to
produce the P/CAF in an amount which will allow the P/CAF gene product to replace
the oncoprotein bound to p300/CBP, whereby the replacement of the oncoprotein
bound to p300/CBP by the P/CAF gene product inhibits the cell cycle progression
inducing effect of the oncoprotein. The oncoprotein which binds p300/CBP in the cell
can be the adenovirus E1A oncoprotein.



A method for providing a functional P/CAF protein to a subject in need of the
functional P/CAF protein is also provided, comprising transducing the cells of the
subject with a vector comprising a nucleic acid encoding the P/CAF protein and
inducing expression of the nucleic acid to produce the functional P/CAF protein in the
cell, thereby providing the functional P/CAF protein to the subject. The transduction of
the vector nucleic acid into the subject's cells can be carried out according to standard
gene therapy protocols well known in the art (see, for example, U.S. Patent No.
5,339,346).

Screening assays for p300/CBP 

The present invention also provides a bioassay for screening substances for the
ability to inhibit the histone acetyltransferase activity of p300/CBP comprising
contacting a system, in which histone acetylation by p300/CBP can be determined, with
the substance under conditions whereby histone acetylation by p300/CBP can occur;
determining the amount of histone acetylation by p300/CBP in the presence of the
substance; and comparing the amount of histone acetylation by p300/CBP in the
presence of the substance with the amount of histone acetylation by p300/CBP in the
absence of the substance, a decreased amount of histone acetylation by p300/CBP in the
presence of the substance indicating a substance that can inhibit the histone
acetyltransferase activity of p300/CBP. The acetylation of histones by p300/CBP can be
determined in a system including, for example, either core histones (histones H2A, H2B,
H3 and H4) or the nucleosome core particles (146 base pairs of DNA wrapped around
the octamer of core histones) as substrates, the p300/CBP protein and radiolabeled
acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of acetylated histones can be
detected by autoradiography after separation by SDS-PAGE as described herein in the
Examples. Thus, the compound to be tested for the ability to inhibit the histone
acetyltransferase activity of p300/CBP can be added to this system and assayed for
acetyltransferase inhibiting ability.

Also provided in the present invention is a bioassay for screening substances for
the ability to inhibit the binding of a transcriptional factor to p300/CBP, comprising
contacting a system in which the binding of a transcriptional factor to p300/CBP can be
determined, with the substance under conditions whereby the binding of the
transcriptional factor and p300/CBP can occur; determining the amount of
transcriptional factor binding to p300/CBP in the presence of the substance; and
comparing the amount of transcriptional factor binding to p300/CBP in the presence of
the substance with the amount of transcriptional factor binding to p300/CBP in the
absence of the substance, a decreased amount of transcriptional factor binding to
p300/CBP in the presence of the substance indicating a substance that can inhibit the
binding of a transcriptional factor to p300/CBP. The binding of a transcriptional factor
to p300/CBP can be determined in a system, for example, which can include a cell free
reaction mixture comprising a transcriptional factor which binds p300/CBP and
p300/CBP. Alternatively, the system can comprise a cell extract produced from cells
producing both a transcriptional factor which binds p300/CBP and p300/CBP. The
transcriptional factor which binds p300/CBP can be selected from, but is not limited to,
the group consisting of nuclear hormone receptors, CREB, c-Jun/v-Jun, c-Myb/v-Myb,
YY1, Sap-1a, c-Fos, MyoD and SRC-1, as well as any other transcriptional factor now
known or later identified to bind p300/CBP. The screening assay of the present
invention can also be used to identify substances which inhibit the binding of p300/CBP
to other components to which it is known to bind, for example, P/CAF, pp90RSK, TFIIB,
E1A, SV40 large T antigen, as well as any other substances now known or later
identified to bind p300/CBP. Determination of the binding of a transcriptional factor or
other substance to p300/CBP can be carried out as taught in the Examples herein as well
as by protocols described in the literature.

The present invention further contemplates a bioassay for screening substances
for the ability to stimulate the histone acetyltransferase activity of p300/CBP comprising
contacting a system, in which histone acetylation by p300/CBP can be determined, with
the substance; determining the amount of histone acetylation by p300/CBP in the
presence of the substance; and comparing the amount of histone acetylation by
p300/CBP in the presence of the substance with the amount of histone acetylation by
p300/CBP in the absence of the substance, an increased amount of histone acetylation
by p300/CBP in the presence of the substance indicating a substance that can stimulate
the histone acetyltransferase activity of p300/CBP. The acetylation of histones by
p300/CBP can be determined in a system including, for example, either core histones
(histones H2A, H2B, H3 and H4) or the nucleosome core particles (146 base pairs of
DNA wrapped around the octamer of core histones) as substrates, the p300/CBP
protein and radiolabeled acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of
acetylated histones can be detected by autoradiography after separation by SDS-PAGE
as described herein in the Examples. Thus, the compound to be tested for the ability to
stimulate the histone acetyltransferase activity of p300/CBP can be added to this system
and assayed for stimulating ability.



The present invention further provides a bioassay for screening substances for
the ability to stimulate binding of a component, which binds p300/CBP, to p300/CBP,
comprising contacting a system in which the binding of the component to p300/CBP can
be determined, with the substance under conditions whereby the binding of the
component to p300/CBP can occur; determining the amount of component binding to
p300/CBP in the presence of the substance; and comparing the amount of component
binding to p300/CBP in the presence of the substance with the amount of component
binding to p300/CBP in the absence of the substance, an increased amount of
component binding to p300/CBP in the presence of the substance indicating a substance
that can stimulate the binding of the component to p300/CBP. The binding of the
component to p300/CBP can be determined in a system, for example, which can include
a cell free reaction mixture comprising the component and p300/CBP. Alternatively, the
system can comprise a cell extract produced from cells producing both the component
and p300/CBP. The component which binds p300/CBP can be any of the transcriptional
factors or other proteins which are known or are identified in the future to bind
p300/CBP, as set forth above. Determination of the binding of the component to
p300/CBP can be carried out as taught in the Examples provided herein and according
to protocols available in the literature.

Histone acetyltransferase activity of p300/CBP

A method for inhibiting the histone acetyltransferase activity of p300/CBP in a
subject is provided in the present invention, comprising administering to the subject a
histone acetyltransferase activity inhibiting amount of a substance in a pharmaceutically
acceptable carrier. The mechanism of the inhibitory action of the substance can be the
inhibition of the binding of a DNA-binding transcription factor, such as, for example, a
nuclear hormone receptor, CREB, c-Jun/v-Jun, c-Myb/v-Myb, YY1, Sap-1a, c-Fos,
MyoD or SRC-1, to p300/CBP.

The histone acetyltransferase activity of p300/CBP can be inhibited in a subject
by administering to the subject a substance which binds p300/CBP at the transcription
factor binding site or a substance which binds the transcription factor protein at the
p300/CBP binding site, the ultimate result being that the transcription factor and
p300/CBP do not bind with one another and p300/CBP cannot acetylate histones.

The substance which binds either to the transcription factor or the p300/CBP
protein and has the net effect of inhibiting the histone acetyltransferase activity of
p300/CBP in the cell can be identified according to the screening methods provided
herein and delivered to a cell in the subject by mechanisms well known in the art. The
substance can be a protein, such as an antibody which binds the p300/CBP protein
at or near the DNA-binding transcription factor binding site, thereby
preventing its binding, or an antibody which binds the DNA-binding transcription factor
at or near the p300/CBP binding site, thereby preventing its binding. The substance can
also bind the histone acetyltransferase site on p300/CBP (aa 1195-1673 on p300 or aa
1174-1850 on CBP) or the acetylation site on the histone, thereby preventing
acetylation by p300/CBP.

Additionally, the substance can be a nucleic acid which can be expressed in the
cell to produce a protein which inhibits the histone acetyltransferase activity of
p300/CBP. For example, a nucleic acid encoding a protein which binds either to a
transcription factor or the p300/CBP protein and has the net effect of inhibiting the
histone acetyltransferase activity of p300/CBP in the cell can be delivered to a cell in the
subject by gene transduction mechanisms well known in the art. For example, nucleic
acid can be introduced by liposomes as well as via retroviral or adeno-associated viral
vectors, as described below.

The substance which inhibits the histone acetyltransferase activity of p300/CBP
can be an antisense RNA or an antisense DNA which binds the RNA or DNA of
p300/CBP, thereby preventing translation or transcription of the RNA or DNA encoding
p300/CBP and having the net effect of inhibiting the histone acetyltransferase activity of
P/CAF by inhibiting p300/CBP production. The antisense RNA or DNA of the present
invention can be produced and introduced into cells according to the same methods as
set forth above for P/CAF antisense nucleic acids.

The present invention also contemplates a method for stimulating the histone
acetyltransferase activity of p300/CBP in a subject comprising administering to the
subject a histone acetyltransferase activity stimulating amount of a substance, in a
pharmaceutically acceptable carrier, determined according to the methods taught herein,
to have a stimulatory effect on the histone acetyltransferase activity of p300/CBP. The
substance can exert a stimulatory effect by promoting the binding of a DNA-binding
transcription factor of the present invention to p300/CBP. The substance can be
delivered to a cell in the subject by mechanisms well known in the art. A nucleic acid
encoding a protein which stimulates the transcription modulating activity of p300/CBP
can be delivered to a cell in the subject by gene transduction mechanisms, as described
below.

Gene transduction

In the methods described above which include gene transduction into cells (i.e.,
addition of exogenous DNA into cells), the nucleic acids of the present invention can be
in a vector for delivering the nucleic acids to the site for expression of the P/CAF
protein. The vector can be one of the commercially available preparations, such as the
pGM plasmid (Promega). Vector delivery can be by liposome, using commercially
available liposome preparations or newly developed liposomes having the features of the
present liposomes. Additionally, vector delivery can be via a viral system, including, but
not limited to, retroviral, adenoviral and adeno-associated viral systems. Other delivery
methods can be adopted and routinely tested according to the methods taught herein.

The modes of administration of the liposome will vary predictably according to
the disease being treated and the tissue being targeted. For example, for treating cancer
in either the lung or the liver, which are both sinks for liposomes, intravenous delivery is
reasonable. For other localized cancers, as well as precancerous conditions,
catheterization of an artery upstream from the target organ is a preferred mode of
delivery, because it avoids significant clearance of the liposome by the lung and liver.
For cancerous lesions at a number of other sites (e.g., skin cancer, localized dysplasias),
topical delivery is expected to be effective and may be preferred, because of its
convenience.

Leukemias and other disorders involving dysregulated proliferation of certain
isolatable cell populations may be more readily treated by ex vivo administration of the
nucleic acid.

The liposomes may be administered topically, parenterally (e.g., intravenously),
by intramuscular injection, by intraperitoneal injection, transdermally, extracorporeally
or the like, although intravenous or topical administration is typically preferred. The
exact amount of the liposomes required will vary from subject to subject, depending on
the species, age, weight and general condition of the subject, the severity of the disease
being treated, the particular compound used, its mode of administration and the like.
Thus, it is not possible to specify an exact amount. However, an appropriate amount
may be determined by one of ordinary skill in the art using only routine experimentation
given the teachings herein.

Parenteral administration, if used, is generally characterized by injection.
Injectables can be prepared in conventional forms, either as liquid solutions or
suspensions, solid forms suitable for solution or suspension in liquid prior to injection, or
as emulsions. A more recently revised approach for parenteral administration involves
use of a slow release or sustained release system such that a constant level of dosage is
maintained. See, e.g., U.S. Patent No. 3,610,795, which is incorporated by reference
herein.

Topical administration can be by creams, gels, suppositories and the like. Ex 
vivo (extracorporeal) delivery can be as typically used in other contexts. 

The present invention is more particularly described in the following examples
which are intended as illustrative only since numerous modifications and variations
therein will be apparent to those skilled in the art.

EXAMPLES

I. P/CAF studies.

Cloning and characterization of P/CAF protein.

In human cells, CBP binds to c-Jun in a phosphorylation-dependent manner in
association with stimulation of transcription (9). In yeast, GCN4 is believed to be a
c-Jun counterpart on the basis of similarities in DNA recognition (15) as well as the
participation of both proteins in UV signaling pathways (16). Yeast genetic screening
has led to the isolation of various cofactors for GCN4, including GCN5 (yGCN5),
ADA2 (yADA2) and ADA3 (yADA3) (17-19). These factors are considered to
function as a complex (or in a common pathway) based on genetic and protein-protein
interaction studies (18-22). Finally, p300/CBP and yADA2 exhibit significant sequence
similarity within a 50 amino acid region including a Zn2+ finger motif (3). Human
counterparts to yGCN5, yADA2, or yADA3 that interact with p300/CBP to mediate
transcriptional activation by c-Jun were searched for in various nucleotide sequence
databases.

Comparison of the yGCN5 protein sequence with various databases (23)
revealed significant similarities with the two randomly sequenced human cDNAs,
ETS05039 (24) (P=4.0x10^-15) and NIB2000-5R (P=6.5x10^-9). Given that these cDNAs
were truncated, human fetal liver and fetal brain cDNA libraries (Clontech) were
screened with ETS05039 and NIB2000-5R, respectively, and complete clones were
isolated from the human fetal liver cDNA library. The complete sequences revealed that
the ETS05039- and NIB2000-5R-derived clones are encoded by distinct genes but are
highly related within the protein coding regions (68% identity at the DNA level; 75%
identity and 86% similarity at the protein level). The former encodes an N-terminal
region with no sequence similarity to any proteins in the databases besides the
yGCN5-related C-terminal region, whereas the latter encodes only the yGCN5-related region.
Given that p300/CBP-binding activity was observed in the former polypeptide as shown
below, it was designated p300/CBP-associated factor (P/CAF), having the amino acid
sequence of SEQ ID NO:1 and the nucleotide sequence of SEQ ID NO:10, and the latter
was named human GCN5 (hGCN5), having the amino acid sequence of SEQ ID NO:5
and the nucleotide sequence of SEQ ID NO:11.

Additionally, an RNA blot (Clontech) was hybridized with a random-primed
probe made from the cDNA encoding P/CAF. RNA blotting indicated that transcripts
detected by the P/CAF and hGCN5 cDNAs are ubiquitously expressed, but the former is
most abundant in heart and skeletal muscle, whereas the latter is most abundant in
pancreas and skeletal muscle.

P/CAF-p300/CBP interaction in vitro 

The P/CAF binding site was presumed to reside in the C-terminal one third of
CBP (residues 1,678-2,442) because it was observed that this region, when fused to a
DNA binding domain, activates transcription (4) in a manner repressed by coexpression
of 12S E1A. This region was divided into 6 overlapping fragments and each was
expressed in E. coli as a glutathione-S-transferase (GST) fusion protein. GST-CBP
fusions were incubated with recombinant P/CAF protein and, subsequently, purified
using glutathione-Sepharose. Co-purified P/CAF was detected by immunoblotting
analysis.

To construct GST-fusions, various regions of CBP and p300 were amplified by
PCR. A series of deletions of the CBP segment B was created by site-directed in vitro
mutagenesis (30). These fragments were subcloned into pGEX-2T (Pharmacia).
GST-fusions were expressed in E. coli and extracted with buffer B [20 mM Tris-HCl (pH
8.0), 5 mM MgCl2, 10% glycerol, 1 mM AEBSF, 0.1% NP40, 10 µg/ml of aprotinin, 10
µg/ml of leupeptin, 1 µg/ml of pepstatin A, 1 mM DTT] containing 0.1 M KCl for these
experiments. GST-CBP-segment B was purified by glutathione-Sepharose and
phenyl-Sepharose chromatographic steps. P/CAF, hGCN5, and E1A were expressed as
FLAG-fusions in Sf9 cells via baculovirus vectors and affinity-purified with M2-agarose
(ref 30; Kodak-IBI). For interaction, a crude E. coli extract containing 20 pmol of
GST-fusion was incubated with 40-60 pmol of P/CAF or E1A in a total volume of 50 µl of
buffer B with 0.1 M KCl on ice for 10 min. Samples were further incubated with 10 µl
(packed volume) of glutathione-Sepharose at 4°C for 30 min, washed four times with
200 µl of buffer B containing 0.1 M KCl, and eluted with 20 µl of buffer E [50 mM
Tris-HCl (pH 8.0), 0.2 M KCl, 20 mM glutathione] for 60 min. Interacting proteins
were detected by anti-FLAG immunoblotting or silver staining.
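
As an illustrative aside, the amounts stated for the binding reaction (20 pmol of GST-fusion and 40-60 pmol of P/CAF or E1A in a total volume of 50 µl) can be converted to approximate molar concentrations. The following Python sketch is not part of the described protocol; the function name `molar_conc_uM` is hypothetical and is introduced only for this calculation.

```python
def molar_conc_uM(amount_pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in microlitres to micromolar.

    1 pmol per 1 microlitre equals 1 micromole per litre, so the plain
    ratio is already expressed in micromolar.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 50.0  # total volume of the binding reaction

gst_fusion_uM = molar_conc_uM(20.0, REACTION_VOLUME_UL)  # GST-fusion
partner_low_uM = molar_conc_uM(40.0, REACTION_VOLUME_UL)  # P/CAF or E1A, lower bound
partner_high_uM = molar_conc_uM(60.0, REACTION_VOLUME_UL)  # P/CAF or E1A, upper bound

# Molar excess of P/CAF (or E1A) over the GST-fusion
excess_low = 40.0 / 20.0
excess_high = 60.0 / 20.0

print(f"GST-fusion: {gst_fusion_uM:.1f} uM")
print(f"P/CAF or E1A: {partner_low_uM:.1f}-{partner_high_uM:.1f} uM")
print(f"Molar excess over GST-fusion: {excess_low:.0f}- to {excess_high:.0f}-fold")
```

Under these stated amounts, P/CAF or E1A is present in a two- to three-fold molar excess over the GST-fusion.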

For p300 interactions, the segment spanning residues 1763-1966 (segment B') of
p300, which is analogous to the CBP segment-B, was used. Twenty percent of the
P/CAF and hGCN5 inputs and 100% of the E1A input were also analyzed. In the GST
precipitation assays, almost identical amounts of the GST fusions were recovered in all
samples. Interaction between P/CAF and CBP (segment B) was determined in the
absence and in the presence of E1A. Control reactions with GST-CBP alone and
without GST-CBP were also performed. Input proteins were analyzed.

Two CBP segments, A and B, interacted specifically with P/CAF. The stronger
interaction was observed in the latter segment, which does not include the yADA2-like
Zn2+ finger. Given that the CBP segment-B is well conserved in p300 (66% identity,
75% similarity), the binding of P/CAF to p300 in vitro was also analyzed. For this
experiment, the p300 segment spanning residues 1763-1966, termed segment B', which
is analogous to the CBP segment-B, was used. Like CBP, p300 interacted specifically
with P/CAF. These studies demonstrated that P/CAF binds specifically to both p300
and CBP in vitro. In contrast to P/CAF, hGCN5 did not bind to CBP or p300.

These studies also demonstrated that the Zn2+ finger region of p300/CBP, which
shares sequence similarity with yADA2, is not essential for the interaction with P/CAF.
Cloning of a human structural homolog of yADA2, termed hADA2 (25), has revealed
that, unlike the sequence similarity between p300/CBP and yADA2, which is restricted
to a 50 amino acid region, hADA2 shares extensive similarity (30% identity, 52%
similarity) to yADA2 over the entire protein sequence. Moreover, a computer search of
the complete genomic sequence of Saccharomyces cerevisiae revealed that yeast does
not have counterparts of p300/CBP or P/CAF. Thus, the p300/CBP-P/CAF pathway
may have been acquired during metazoan evolution.



Action of E1A in vitro

Previous reports indicated that E1A binds to both the p300 segment spanning
residues 1767-1816 and the CBP segment spanning residues 1805-1854 (7). These
interactions were reconfirmed in the present system; thus, both p300 and CBP segments
covering the previously identified regions interacted with E1A.

For further mapping, a series of deletions was introduced within the CBP
segment-B and tested for interactions with P/CAF and E1A. Deletions of residues
1801-1825 or 1824-1851 markedly reduced interactions with both P/CAF and E1A,
whereas deletion of residues 1850-1878 did not affect these interactions. Furthermore,
deletion of residues 1801-1851 completely abolished interactions with both P/CAF and
E1A. These data indicate that residues 1801-1851 of CBP are critical for interaction
with both P/CAF and E1A. Taken together with the evidence that CBP segment A (aa
residues 1,678-1,880) also binds to these factors, the above findings demonstrate that
P/CAF and E1A bind to the same or very closely spaced sites on CBP.

Evidence that both P/CAF and E1A recognize the same p300/CBP segments
raises the possibility of direct competition between P/CAF and E1A for binding to
p300/CBP. To test this possibility, a competition experiment was performed with the
use of affinity purified recombinant proteins. The interaction of P/CAF with the
CBP-segment B was progressively inhibited by the addition of increasing amounts of E1A. In
contrast, no inhibition was caused by an E1A mutant which does not bind to p300/CBP
(E1AΔN). Similar results were obtained with the p300-segment B', leading to the
conclusion that P/CAF and E1A compete for the same binding sites in p300/CBP.



P/CAF-p300/CBP interaction in vivo

The in vivo interaction between P/CAF and p300/CBP was established by
co-immunoprecipitation from a human osteosarcoma cell extract. Proteins in this extract
were immunoprecipitated with rabbit anti-P/CAF, rabbit anti-CBP and anti-p300
antibodies. For controls, cell extract was precipitated with rabbit control IgG or mouse
anti-HA monoclonal antibody. The precipitates were analyzed by immunoblotting with
anti-P/CAF, anti-CBP and anti-p300 antibodies.



Osteosarcoma cells were transfected with either control vector or E1A- or
E1AΔN-expression vectors. Extract from the transfected subpopulation was
immunoprecipitated with anti-P/CAF or control IgG. The precipitates were analyzed by
immunoblotting with anti-p300 and anti-P/CAF antibodies.

Rabbit anti-P/CAF antibody was raised to the P/CAF segment spanning residues
125-397 and purified by immunoaffinity chromatography (33). A mixture of
monoclonal antibodies raised to the human p300 segment spanning residues 1572-2371
(5) and rabbit polyclonal antibodies raised to the mouse CBP segment spanning residues
2-23 (for immunoprecipitation) and 1736-2179 (for immunoblotting) were purchased from
Upstate Biotechnology. Approximately 2 x 10^7 human osteosarcoma U-2 OS cells
(ATCC accession number HTB 96) were extracted with 10 ml of lysis buffer [25 mM
HEPES-KOH (pH 7.2), 150 mM potassium acetate, 2 mM EDTA, 1 mM DTT, 1 mM
AEBSF, 10 µg/ml of aprotinin, 10 µg/ml of leupeptin, 1 µg/ml of pepstatin A, 20 mM
sodium fluoride, 0.1% NP40]. Two to 10 ml of extract were incubated with 2 µg of the
respective antibody for four hours at 4°C. Fifty µl (packed volume) of protein-A
Trisacryl (Pierce) were added and incubation was continued for two hours. The matrix
was washed four times with 1 ml of the lysis buffer, then boiled in 2x SDS sample
buffer. Human osteosarcoma U-2 OS cells were transfected with 20 µg of the indicated
plasmid and 1 µg of sorting plasmid (pCMV-IL2R) (31). The transfected subpopulation
was purified by magnetic affinity cell sorting (32). Extract from approximately 2 x 10^5
sorted cells was immunoprecipitated as described.

Anti-P/CAF antibody specifically detected a 95 kDa protein, which is very close
to the calculated value for the full-length P/CAF, in the immunoprecipitates.
Anti-P/CAF antibody co-immunoprecipitated both CBP and p300. Similarly, anti-CBP
antibody also co-immunoprecipitated P/CAF. However, anti-p300 antibody did not
co-immunoprecipitate P/CAF. This is most likely due to steric interference since the
anti-p300 antibody was raised to the p300 segment spanning residues 1572-2371 which
includes the P/CAF binding region. These data demonstrate that P/CAF forms
complexes with both p300 and CBP in vivo.

Action of E1A in vivo

The in vitro experiments described herein indicate that P/CAF and E1A compete
for the binding sites in p300/CBP. Thus, a study was conducted to determine whether
E1A targets the endogenous interaction between P/CAF and p300. An E1A-expression
vector was transiently transfected into human osteosarcoma cells and the transfected
subpopulation was purified by cell sorting. Then, the interaction between P/CAF and
p300 in transfected cells was examined by co-immunoprecipitation with anti-P/CAF
antibody. The endogenous interaction of P/CAF with p300 was drastically inhibited by
expression of E1A. On the other hand, no inhibition was observed with the E1A mutant
lacking the p300 binding domain (E1AΔN), indicating that E1A disrupts the
P/CAF-p300 complex in vivo through an interaction with p300.

Cell cycle regulation by P/CAF 

Given that binding of P/CAF to p300/CBP is inhibited by E1A, experiments
were performed to evaluate whether P/CAF, by binding to and forming a functional
complex with p300, is involved in the regulation of entry into S phase. This possibility
was addressed by examining whether transient expression of P/CAF would affect the
rate of G1/S transit in HeLa cells. P/CAF negatively affected the distribution of cells
between G1 and S phases in this assay.

HeLa cells were transfected by electroporation with 7 µg of P/CAF-expression
plasmid and/or 3 µg of the full-length or the N-terminally deleted (Δ2-36) E1A
12S-expression plasmid as indicated. These plasmids were constructed by subcloning
FLAG-P/CAF and E1A cDNAs into pCX (34) and pcDNAI (Invitrogen), respectively.
All samples, in addition, contained 1 µg of sorting plasmid (pCMV-IL2R) (31) and
carrier plasmid (pCX) to normalize the total amount of DNA to 11 µg. After
transfection, cells were incubated in Dulbecco's modified Eagle's medium with 10% fetal
bovine calf serum for 12 h, and subsequently labeled in medium containing 10 µM
bromo-deoxyuridine (BrdU) for 30 min. Subsequently, the transfected subpopulation
was purified by magnetic affinity cell sorting and nuclei were analyzed by dual parameter
flow cytometry as described (32).
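
The normalization of total DNA with carrier plasmid can be illustrated with a short calculation. The Python sketch below assumes that the plasmid amounts are 7 µg (P/CAF), 3 µg (E1A) and 1 µg (sorting plasmid), with a total of 11 µg per transfection; the function name `carrier_dna_ug` and the condition labels are hypothetical and serve only as an example.

```python
TOTAL_DNA_UG = 11.0        # assumed total DNA per transfection
SORTING_PLASMID_UG = 1.0   # assumed amount of sorting plasmid in every sample


def carrier_dna_ug(pcaf_ug: float = 0.0, e1a_ug: float = 0.0) -> float:
    """Return the amount of carrier plasmid needed to reach the total."""
    carrier = TOTAL_DNA_UG - SORTING_PLASMID_UG - pcaf_ug - e1a_ug
    if carrier < 0:
        raise ValueError("plasmid amounts exceed the total DNA")
    return carrier


# Carrier plasmid required for each combination of expression plasmids
conditions = {
    "control": carrier_dna_ug(),
    "P/CAF": carrier_dna_ug(pcaf_ug=7.0),
    "E1A": carrier_dna_ug(e1a_ug=3.0),
    "P/CAF + E1A": carrier_dna_ug(pcaf_ug=7.0, e1a_ug=3.0),
}

for name, carrier in conditions.items():
    print(f"{name}: {carrier:.0f} ug carrier plasmid")
```

Under these assumptions, the sample containing both expression plasmids requires no carrier plasmid, which is consistent with a total of 11 µg.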

The fraction of cells accumulating in S phase in control cultures was 23%,
compared to 15% in P/CAF-transfected cells. This effect was reproducible in multiple
independent experiments. In parallel experiments to verify the utility of this
experimental protocol, plasmids encoding E2F-1, simian virus 40 small t, cyclin A or
cyclin E increased the accumulation of cells in S phase, whereas plasmids encoding the
cyclin-dependent kinase inhibitors p21 or p27 reduced the number of S phase cells.
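
For illustration, the two S phase fractions reported above (23% in control cultures, 15% in P/CAF-transfected cells) can be expressed as an absolute and a relative change. The Python sketch below uses only these two values; the function name `relative_change` is hypothetical.

```python
def relative_change(control: float, treated: float) -> float:
    """Return the change of `treated` relative to `control` as a fraction."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (treated - control) / control


control_s_phase = 23.0  # percent of cells in S phase, control cultures
pcaf_s_phase = 15.0     # percent of cells in S phase, P/CAF-transfected cells

absolute_change = pcaf_s_phase - control_s_phase            # percentage points
fractional_change = relative_change(control_s_phase, pcaf_s_phase)

print(f"Absolute change: {absolute_change:.0f} percentage points")
print(f"Relative change: {fractional_change * 100:.1f}%")
```

The reduction of 8 percentage points corresponds to roughly a one-third decrease in the S phase fraction relative to control.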

On the basis of evidence that E1A and P/CAF compete for binding sites on
p300, it seemed possible that cotransfection of P/CAF with E1A would oppose the
mitogenic effect caused by E1A. As shown by the data herein, this is indeed the case.
E1A alone has mitogenic activity in this experimental setting, while the E1A mutant
lacking the p300 binding domain (E1AΔN) has very weak activity. Comparable
expression levels between wild type and mutant E1A in the transfected cells were
revealed by immunoblotting analysis with anti-E1A. Intriguingly, when P/CAF was
cotransfected with E1A, the mitogenic activity of E1A was significantly counteracted by
P/CAF. These results show that P/CAF and E1A mediate antagonistic effects on cell
cycle progression.

In the course of assessing P/CAF activity, it was also revealed that p300 is able
to inhibit cell cycle progression under the same assay conditions. These findings suggest
that P/CAF and p300, perhaps by forming a complex, act in concert to suppress cell
cycle progression.

Histone acetyltransferase activity in P/CAF 

Acetylation of the N-terminal histone tails has been considered to play a crucial
role in accessibility of transcription factors to nucleosomal templates (26-27). Recently,
yGCN5 has been identified as a histone acetyltransferase (28). On the basis of this
information, intrinsic histone acetyltransferase activity in P/CAF and hGCN5 was
examined. As substrates, the core histones (histones H2A, H2B, H3 and H4) and the
nucleosome core particles (146 base pairs of DNA wrapped around the octamer of core
histones) were used.

Activity of hGCN5 and P/CAF that acetylates free histones or histones in the
nucleosome core particle (35) was measured as described (36). Each reaction contained
0.3 pmol of affinity purified FLAG-hGCN5 or FLAG-P/CAF, 4 pmol of the histone
octamer or the nucleosome core particle and 10 pmol of [1-14C]acetyl-CoA. The
histone octamer dissociated into dimers or tetramers under assay conditions. Acetylated
histones were detected by autoradiography after separation by SDS-PAGE.
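
The reaction composition given above implies simple molar ratios between enzyme, substrate and acetyl donor. The following Python sketch is only an arithmetic illustration using the stated amounts (0.3 pmol of enzyme, 4 pmol of histone octamer or nucleosome core particle, 10 pmol of acetyl-CoA); the variable names are hypothetical, and the upper bound assumes complete transfer of every acetyl group, which the source does not claim.

```python
ENZYME_PMOL = 0.3       # FLAG-hGCN5 or FLAG-P/CAF per reaction
SUBSTRATE_PMOL = 4.0    # histone octamer or nucleosome core particle
ACETYL_COA_PMOL = 10.0  # labeled acetyl-CoA

# Molar excess of substrate over enzyme
substrate_per_enzyme = SUBSTRATE_PMOL / ENZYME_PMOL

# Upper bound on acetyl groups per octamer if all acetyl-CoA were consumed
max_acetyl_per_octamer = ACETYL_COA_PMOL / SUBSTRATE_PMOL

print(f"Substrate:enzyme molar ratio: {substrate_per_enzyme:.1f}")
print(f"Maximum acetyl groups per octamer: {max_acetyl_per_octamer:.1f}")
```

With these amounts the substrate is in roughly 13-fold molar excess over the enzyme, and at most 2.5 acetyl groups per octamer could be incorporated.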

P/CAF and hGCN5 acetylated the core histones with almost the same efficiency.
Both factors acetylated histones H3 and H4, but preferentially H3. In contrast, very
weak or no acetylation by hGCN5 was detected in the nucleosome core particles.
Remarkably, significant acetylation by P/CAF was observed in a nucleosomal context.
Although all core histones are acetylated in the nucleus, P/CAF and hGCN5 did not
acetylate histones H2A and H2B in vitro.

The direct function of P/CAF is likely to involve its intrinsic histone acetyltransferase
activity. Although the exact molecular mechanisms by which acetylation of core histones
contributes to transcription remain undefined, acetylation of the histones is considered to
play an important role in transcriptional regulation (26-27). The positively charged
N-terminal tails of core histones are believed to affect nucleosome structure by interacting
with DNA at or near the nucleosome-spacer junction. Acetylation of the histone tails
presumably destabilizes the nucleosome and facilitates access by regulatory factors.
Likewise, there is a general correlation between the level of acetylation and
transcriptional activity of nucleosomal domains. The findings of the present invention
provide insights into the mechanisms of targeted histone acetylation.

Cellular factor p300/CBP binds to various sequence-specific factors that are
involved in cell growth and/or differentiation, including CREB (3,4), c-Jun (9), Fos (11),
c-Myb (12) and nuclear receptors (13). P/CAF could stimulate the activation function
of these factors via promoter-specific histone acetylation. The present invention
demonstrates that E1A appears to perturb normal cellular regulation by disrupting the
connection between p300/CBP and its associated histone acetyltransferase.

II. p300/CBP studies.

Purification of E1A-associated histone acetyltransferase

FLAG-epitope tagged E1A (or ΔE1A) was expressed in Sf9 cells (ATCC
accession number CRL 1711) by infection with recombinant baculovirus (43). All
purification steps were carried out at 4°C. Extract was prepared from infected cells by
one cycle of freeze and thaw in buffer B (20 mM Tris-HCl, pH 8.0; 5 mM MgCl2; 10%
glycerol; 1 mM PMSF; 10 mM β-mercaptoethanol; 0.1% Tween 20) containing 0.1
M KCl and the complete protease inhibitor cocktail (Boehringer Mannheim). To
prepare E1A-immobilized beads, the extract was incubated with M2
anti-FLAG antibody agarose (Kodak-IBI) for four hours with rotating and
subsequently washed with the same buffer three times. The resulting beads were
incubated with HeLa (ATCC accession number CCL 2) nuclear extract for four to eight
hours and thereafter washed with the same buffer six times. Finally, FLAG-E1A was
eluted from the beads along with associated polypeptides by incubating with the same
buffer containing 0.1 mg/ml FLAG peptide.

For further purification, eluted polypeptides were dialyzed in 0.05 M KCl-buffer
B and subsequently loaded onto a SMART Mono Q column (Pharmacia) equilibrated
with the same 0.05 M KCl-buffer B. After washing, the column was developed with a
linear gradient of 0.05-1.0 M KCl in buffer B. Mono Q fractions were concentrated with
a MICROCON spin-filter (Amicon) and subsequently loaded onto a SMART Superdex
200 column (Pharmacia) equilibrated with 0.1 M KCl-buffer B.

Histone acetyltransferase assays 

Filter binding assays were performed as described (80) with minor modifications.
Samples were incubated at 30°C for 10-60 minutes in 30 µl of assay buffer containing
50 mM Tris-HCl, pH 8.0; 10% glycerol; 1 mM DTT; 1 mM PMSF; 10 mM sodium
butyrate; 6 pmol of [3H]acetyl CoA (4.3 mCi/mmole, Amersham Life Science Inc.); and
33 µg/ml of calf thymus histones (Sigma Chemical Co.). In experiments where synthetic
peptides were substituted for core histones, 50 pmol of each peptide were used. After
incubation, the reaction mixture was spotted onto Whatman P-81 phosphocellulose filter
paper and washed for 30 minutes with 0.2 M sodium carbonate buffer, pH 9.2, at room
temperature with 2-3 changes of the buffer. The dried filters were counted in a liquid
scintillation counter.

PAGE analysis was done as above except that 90 pmol of [14C]acetyl CoA (55
mCi/mmole, Amersham Life Science Inc.) and 9 pmol of core histones or
mononucleosomes were used. Core histones and mononucleosomes were prepared as
described (35). For trypsin digestion, reaction mixtures were further incubated with
various amounts of trypsin on ice for 30 minutes. The samples were analyzed on
one-dimensional SDS-PAGE gels or two-dimensional gels, where the first dimension was an
acid-urea-PAGE gel (44) and the second dimension was an SDS-PAGE gel.

Protein expression 

For baculovirus expression, cDNAs corresponding to p300 portions of aa 1-670,
aa 671-1194 and aa 1135-2414 were amplified by PCR (EXPAND High Fidelity PCR
System; Boehringer Mannheim) as KpnI-NotI fragments. The resulting fragments were
subcloned into a baculovirus transfer vector having the FLAG-tag sequence (43). The
recombinant viruses were isolated using the BACULOGOLD system (Pharmingen),
according to the manufacturer's protocol, and were infected into Sf9 cells (ATCC
accession number CRL 1711) to express FLAG-p300. Recombinant proteins were
affinity purified with M2 anti-FLAG antibody-immobilized agarose (Kodak-IBI)
according to the manufacturer's protocol.
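
As an illustrative check, the three p300 portions named above (aa 1-670, aa 671-1194 and aa 1135-2414) can be tested for complete coverage of the protein and for overlap between neighboring fragments. The Python sketch below assumes a full-length p300 of 2414 residues, inferred from the last residue of the C-terminal fragment; the function names `overlap` and `covers` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive (first residue, last residue)

P300_FRAGMENTS: List[Interval] = [(1, 670), (671, 1194), (1135, 2414)]
P300_LENGTH = 2414  # assumption: last residue of the C-terminal fragment


def overlap(a: Interval, b: Interval) -> int:
    """Return the number of residues shared by two inclusive intervals."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def covers(fragments: List[Interval], first: int, last: int) -> bool:
    """Return True if the fragments jointly cover residues first..last."""
    next_needed = first
    for start, end in sorted(fragments):
        if start > next_needed:
            return False  # gap before this fragment
        next_needed = max(next_needed, end + 1)
        if next_needed > last:
            return True
    return next_needed > last


full_coverage = covers(P300_FRAGMENTS, 1, P300_LENGTH)
n_mid_overlap = overlap(P300_FRAGMENTS[0], P300_FRAGMENTS[1])
mid_c_overlap = overlap(P300_FRAGMENTS[1], P300_FRAGMENTS[2])

print(f"Full coverage: {full_coverage}")
print(f"Overlap, N-terminal/middle fragments: {n_mid_overlap} residues")
print(f"Overlap, middle/C-terminal fragments: {mid_c_overlap} residues")
```

Under this assumption the fragments cover the whole protein; the N-terminal and middle fragments abut without overlap, while the middle and C-terminal fragments share residues 1135-1194.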

For bacterial expression, cDNAs encoding the p300 portions and the CBP
portion (aa 1174-1850) were first subcloned into the baculovirus transfer vector having
the FLAG-tag as described above. Thereafter, the XhoI and NotI fragments encoding
FLAG-p300 or FLAG-CBP fusions were resubcloned into the E. coli expression vector
pET-28c (Novagen) digested with SalI and NotI. Recombinant proteins were
expressed in E. coli BL21(DE3) and affinity purified with M2-antibody agarose.

Histone acetyltransferases that associate with E1A

Although the adenovirus E1A 12S protein (E1A) inhibits transcription of a
variety of genes via direct binding to p300/CBP (45), E1A also stimulates transcription
in some contexts (46). Thus, p300/CBP-bound E1A was tested to determine whether it
might recruit histone acetyltransferases or deacetylases to regulate transcription. In
addition, experiments were conducted as described below to determine if p300/CBP per
se is a histone acetyltransferase.

Initially, recombinant FLAG-epitope tagged E1A was immobilized on
anti-FLAG antibody beads. Immobilized E1A was incubated with a HeLa nuclear
extract for affinity purification of E1A-associated polypeptides. FLAG-E1A
was then eluted from the beads, along with E1A-associated polypeptides, by
incubating with FLAG-peptide. Although E1A per se has no histone acetyltransferase
activity, E1A recruited significant amounts of histone acetyltransferase activity from the
nuclear extract. It is very unlikely that this activity is derived from P/CAF given that
E1A and P/CAF cannot bind to p300/CBP simultaneously (43). Consistent with this, no
P/CAF was detected in these fractions by immunoblotting.

The E1A N-terminus, a region that is not highly conserved among the various
adenovirus serotypes, is involved in p300/CBP binding in vivo. Mutations in the
N-terminal region lead to loss of the ability for p300/CBP binding without affecting RB
binding (1,47). Thus, the requirement of the E1A N-terminal region for the recruitment
of histone acetyltransferase activity was tested. In contrast to the wild type, the
N-terminal deleted form of E1A (ΔN-E1A) recruited only a background level of
acetyltransferase activity. In agreement with previous reports (47), the ΔN-E1A
showed no ability to interact with p300/CBP, although it still retained the ability to
interact with a variety of other polypeptides, including RB.

To define the relationship between p300/CBP and histone acetyltransferase
activity, affinity purified E1A-binding polypeptides were separated on a Mono Q
ion-exchange column. Both p300/CBP and the acetyltransferase activity were coeluted
at 140 mM KCl, while most of the polypeptides were eluted at 260 mM KCl. The active
fraction of the Mono Q column (~140 mM KCl) was further separated on a Superdex-200 gel
filtration column. Both p300/CBP and the acetyltransferase activity coeluted after the
void volume, indicating that p300/CBP is involved in the histone acetyltransferase
activity.
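
The elution positions can be related to the salt gradient with a simple interpolation. The Python sketch below assumes that the Mono Q column was developed with the linear 0.05-1.0 M KCl gradient described in the purification procedure and ignores column dead volume and any lag between programmed and actual salt concentration; the function name `gradient_fraction` is hypothetical.

```python
def gradient_fraction(conc_mM: float,
                      start_mM: float = 50.0,
                      end_mM: float = 1000.0) -> float:
    """Return how far along a linear gradient a given concentration lies.

    The result is 0.0 at the start of the gradient and 1.0 at the end.
    """
    if not start_mM <= conc_mM <= end_mM:
        raise ValueError("concentration lies outside the gradient")
    return (conc_mM - start_mM) / (end_mM - start_mM)


activity_peak = gradient_fraction(140.0)  # p300/CBP and acetyltransferase activity
bulk_peak = gradient_fraction(260.0)      # most other polypeptides

print(f"Activity elutes at {activity_peak * 100:.1f}% of the gradient")
print(f"Bulk polypeptides elute at {bulk_peak * 100:.1f}% of the gradient")
```

On these assumptions the acetyltransferase activity elutes within the first tenth of the gradient, well ahead of the bulk of the E1A-binding polypeptides.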

p300 is a histone acetyltransferase 

The data provided herein indicate that p300 per se, or a polypeptide(s)
associated with p300, possesses histone acetyltransferase activity. To test the former
possibility, the acetyltransferase activity of recombinant p300 was measured. p300 was
divided into three fragments, each of which was expressed in and purified from Sf9 cells
via a baculovirus expression vector. Histone acetyltransferase activity was readily
detected in the C-terminal fragment containing amino acids 1135-2414, whereas no
activity was found in the other fragments, demonstrating conclusively that p300 per se is
a histone acetyltransferase.



p300/CBP-histone acetyltransferase domain

To map the histone acetyltransferase domain of p300, a series of deletions
was prepared. Given the poor conservation of the glutamine-rich region (aa 1815-2414)
in the C. elegans p300/CBP homolog (6), the p300 fragment encoding aa 1135-1810
was expressed in and purified from E. coli. Importantly, this candidate region of p300
(aa 1135-1810) showed significant histone acetyltransferase activity. For further
mapping within this region, a series of N-terminal deletions was constructed. Deletion
of 60 residues, resulting in a fragment containing aa 1195-1810, had no effect on the
acetyltransferase activity, whereas the deletion of 185 residues, yielding a fragment
comprising aa residues 1320-1810, completely eliminated the acetyltransferase activity.

Next, a series of C-terminal deletions was analyzed to determine the requirement
of the P/CAF (or E1A)-binding domain. The p300 fragments lacking the E1A binding
domain (aa 1195-1760, 1195-1706 and 1195-1673) still retained the acetyltransferase
activity, whereas the further truncated mutant (aa 1195-1652) completely lost the
acetyltransferase activity. Consistent with these results, a mutant with an internal
deletion of residues 1418-1720 showed no acetyltransferase activity. These data
demonstrate that the histone acetyltransferase domain is located between the
bromodomain and the E1A-binding domain. Given that the histone acetyltransferase
domain is highly conserved between p300 and CBP (91% similarity), the corresponding
region of CBP, aa residues 1174-1850, was expressed to confirm the acetyltransferase
activity. As expected, comparable activity was detected, indicating that both p300 and
CBP are histone acetyltransferases.
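
The reasoning behind the deletion mapping can be sketched as a small interval calculation. The Python example below takes the p300 fragments and activities reported above and derives the windows within which the N-terminal and C-terminal boundaries of the acetyltransferase domain must lie. It is an illustrative simplification, not the analysis actually performed; the function name `boundary_windows` is hypothetical, and the internal deletion mutant is omitted for brevity.

```python
from typing import List, Optional, Tuple

# (first residue, last residue, acetyltransferase activity detected)
Fragment = Tuple[int, int, bool]
Window = Tuple[Optional[int], Optional[int]]

FRAGMENTS: List[Fragment] = [
    (1135, 1810, True),
    (1195, 1810, True),
    (1320, 1810, False),
    (1195, 1760, True),
    (1195, 1706, True),
    (1195, 1673, True),
    (1195, 1652, False),
]


def boundary_windows(fragments: List[Fragment]) -> Tuple[Window, Window]:
    """Return (N-terminal window, C-terminal window) for the active domain."""
    active = [(s, e) for s, e, ok in fragments if ok]
    inactive = [(s, e) for s, e, ok in fragments if not ok]

    # The domain must fit inside every active fragment.
    core_start = max(s for s, _ in active)
    core_end = min(e for _, e in active)

    # Inactive fragments that differ from the core at one end only
    # show how far that end can be trimmed before activity is lost.
    n_cuts = [s for s, e in inactive if s > core_start and e >= core_end]
    c_cuts = [e for s, e in inactive if e < core_end and s <= core_start]

    n_window: Window = (core_start, min(n_cuts) if n_cuts else None)
    c_window: Window = (max(c_cuts) if c_cuts else None, core_end)
    return n_window, c_window


n_window, c_window = boundary_windows(FRAGMENTS)
print(f"N-terminal boundary lies between residues {n_window[0]} and {n_window[1]}")
print(f"C-terminal boundary lies between residues {c_window[0]} and {c_window[1]}")
```

On this simplified reading, the smallest active fragment is aa 1195-1673, with the N-terminal boundary between residues 1195 and 1320 and the C-terminal boundary between residues 1652 and 1673.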

Among various acetyltransferases including histone acetyltransferases GCN5 and
P/CAF, putative acetyl-CoA binding sites are conserved (48). However, multiple
alignment analysis (49) showed that the p300/CBP histone acetyltransferase domain
does not belong to this group. Moreover, comparison of the p300/CBP histone
acetyltransferase domain with peptide sequence databases (23) showed no sequence
similarity to any other proteins. Accordingly, this invention shows that p300/CBP
represents a novel class of acetyltransferases in that it does not have the conserved motif
found among previously described acetyltransferases (48).

p300 acetylates all core histones in mononucleosomes

Substrate specificity for acetylation by p300 was also examined. As substrates,
histone octamers and mononucleosomes (146 base pairs of DNA wrapped around the
octamer of core histones) were used. Given that the histone octamer dissociates into
dimers or tetramers under physiological conditions, the histone octamer is referred to
here as core histones. When core histones were used, p300 acetylated all four proteins,
but preferentially H3 and H4. More importantly, in a nucleosomal context, p300
acetylated all four core histones nearly stoichiometrically. In contrast, p300 acetylated
neither BSA nor lysozyme.

Hyperacetylated histones are believed to be linked with transcriptionally active 
chromatin (26,27,50,51). Hyperacetylated forms are found in histones H4, H3 and H2B, 
which have multiple acetylation sites in vivo. Thus, the level of acetylation by p300 was 
also tested. 

Mononucleosomes treated with p300 were analyzed by two-dimensional gel
electrophoresis. A Coomassie blue-stained gel and the corresponding autoradiogram
showed that a significant amount of histones, especially H4, were hyperacetylated.
Importantly, acetylation levels by p300 were very close to those of hyperacetylated
histones prepared from HeLa nuclei treated with sodium butyrate, a histone deacetylase
inhibitor. In contrast, no acetylated forms were detected in the reaction
without p300. These results indicate that p300 acetylates histones in mononucleosomes
to the hyperacetylated state by targeting multiple lysine residues.

p300 acetylates the four lysines in the histone H4 N-terminal tail in vitro which are 
acetylated in vivo 

Lysines at positions 5, 8, 12 and 16 of histone H4 are acetylated in vivo
(51). Recent studies with yeast histone acetyltransferases demonstrate the
position-specific acetylation by distinct acetyltransferases, i.e., while cytoplasmic
acetyltransferases for histone deposition and chromatin assembly modify
positions 5 and 12, GCN5 modifies positions 8 and 16 (52). Accordingly, the positions
of acetylation by p300 were also determined. A series of synthetic peptides containing
acetylated lysines at various positions was used to determine the acetylation
site-specificity of p300. Consistent with the two-dimensional gel electrophoresis
analysis, the experiments with peptide substrates showed that p300 acetylates all four
lysines in the histone H4 that are acetylated in vivo. These results are consistent with the
view that deposition-related diacetylated histones are deacetylated during maturation
of chromatin (53).



p300 preferentially acetylates the N-terminal histone tail

Histone acetyltransferases modify specific lysine residues in the N-terminal
tail of core histones but not the C-terminal globular domain in vivo (26,27,50,51).
Structural models of nucleosomes (54,55,56) suggest that most of the lysine residues in
the C-terminal globular domain are buried. Therefore, experiments were conducted to
examine whether restricted acetylation of the N-terminal tail resulted from the substrate
specificity of the enzyme or inaccessibility of the enzyme to the core domain in
nucleosomes. The globular domains of all core histones contain a long helix flanked on
either side by a loop segment and short helix, termed the "histone fold" (54,55,56).
The histone fold is involved in formation of the stable H2A-H2B and H3-H4
hetero-dimers, consisting of extensive hydrophobic contacts between the paired
molecules. Therefore, it is likely that a histone monomer cannot fold properly, thereby
increasing access of the histone acetyltransferase to the core domain. Based on these
considerations, experiments were conducted to determine whether p300 acetylates free
histone H4 in an N-terminal-specific manner.

Histone H4 was acetylated with p300 and subsequently the histone tail was
removed by partial digestion with trypsin. The distributions of radioactivity between
intact and core histones were compared. While the globular core histone domain was
predominant at the higher trypsin concentrations, radioactivity was detected mostly in
the intact histone. These data demonstrate that p300 preferentially acetylates the
N-terminal tail of histone H4.

III. P/CAF interaction with MyoD

Tissue culture and transfection experiments 

C2C12 mouse cells (ATCC accession number CRL 1772) were grown in
Dulbecco's modified Eagle medium (DMEM) supplemented with 20% fetal bovine
serum (FBS) until they reached confluence. Differentiation was induced by switching
medium to differentiation medium (DM), consisting of DMEM containing 2% horse
serum. C3H/10T1/2 fibroblasts (ATCC accession number CCL 226) were grown in
DMEM supplemented with 10% FBS. Cells were transfected by the calcium phosphate
precipitation method. Total amounts of transfected DNA were equalized by empty
vector DNA. After 12 h incubation in medium containing the precipitated DNA, the
cells were washed and incubated in fresh DMEM containing 10% FBS for an additional
24 h. Afterwards, differentiation was induced by incubating in DM for 36 to 72 h.
Chloramphenicol acetyltransferase (CAT) assays were performed as previously
described (64,69). The quantities of cell extracts used for CAT assays were normalized
to β-galactosidase activity by cotransfection of 1 µg of the β-galactosidase expression
vector, pON260.

Expression vectors used for transfection experiments are as follows:
pCX-P/CAF for P/CAF (43); pCMV-bp300 for p300 (65), pCMV-p300 (1869-2414)
(64) and pCMV-p300 (1514-1922) (60) for p300 wild type and mutants; pE1A12S,
pE1A12S R2G, pE1A12S D2-36 and pE1A12S D121-130 for E1A wild type and
mutants (66,67,68); and pEMSV-MyoD for MyoD (64).



The antisense P/CAF RNA expression vector, pcDNA3 P/CAF-AS, was created
as follows. The 2.5 Kb EcoRI-KpnI fragment containing the entire P/CAF open reading
frame was isolated from pCX-P/CAF (43). This fragment was subcloned into the
EcoRI-KpnI sites of plasmid pcDNA3 (Invitrogen) so that the antisense P/CAF RNA is
driven under the CMV promoter. Reporter genes employed were 4RE-CAT and
MCK-CAT (69). 4RE-CAT is driven by a synthetic promoter containing 4 copies of the
E-box, whereas MCK-CAT is driven by the native MCK promoter (nucleotides -1256 to
+7).

Microinjection and immunofluorescence 

Cells were grown on small glass slides, subdivided into numbered squares of 2 
mm x 2 mm, and microinjected with purified and concentrated antibodies as previously 
described (70). For immunofluorescence, cells were fixed in either 2% 
paraformaldehyde or 1:2 methanol/acetone solution, preincubated with 5% BSA/PBS 
and incubated with the primary antibodies for 30 min at 37°C. Subsequently, antibody 
was visualized by incubating with either rhodamine- or fluorescein-conjugated secondary 
antibody for 30 min at 37°C. Injected antibodies were stained with a 
rhodamine-conjugated secondary antibody, and nuclei were counter-stained with DAPI as 
previously described (69). 

Antibodies employed were as follows: affinity-purified rabbit polyclonal 
anti-P/CAF antibody (43), rabbit polyclonal anti-p300/CBP antiserum (71), mouse 
monoclonal anti-MyoD antibody (clone 5.8A, kindly provided by P. Houghton), 
affinity-purified goat polyclonal anti-c-Jun antibody (Santa Cruz) and rabbit pre-immune 
serum. 

Immunoprecipitation and DNA affinity purification 

Cells were resuspended in lysis buffer (20 mM NaPO4, 150 mM NaCl, 5 mM 
MgCl2, 0.1% NP40, 1 mM DTT, 10 mM sodium fluoride, 0.1 mM sodium vanadate, 1 
mM phenylmethylsulfonyl fluoride and 10 µg/ml each of leupeptin, aprotinin and 
pepstatin). After 30 min of incubation on ice, samples were centrifuged at 12,000 x g for 
30 min and supernatants were used as cell extracts. Extracts were pre-cleared by 
incubating with rabbit pre-immune serum and protein A/G Plus-Agarose (Santa Cruz) 
for 2 h at 4°C. For immunoprecipitation, the supernatants were incubated with the 
respective antibodies for 3 h at 4°C. Protein A/G Plus-Agarose was added, and 
incubation continued for 3 h. The matrix was washed with lysis buffer, then boiled in 
2X SDS sample buffer. Immunoblotting was performed using the ECL 
chemiluminescent detection kit (Amersham) according to the manufacturer's protocol. 
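As an aid to preparing a buffer such as the lysis buffer described above, the following Python sketch applies the usual dilution relation (stock volume = final concentration x total volume / stock concentration). The stock concentrations are assumptions chosen for illustration; the text specifies only the final concentrations, and only a subset of the components is shown.

```python
def stock_volumes(recipe, total_ml):
    """Return ml of each stock, plus water, needed for `total_ml` of buffer.

    `recipe` maps a component to (final_conc, stock_conc); both values of a
    pair must use the same unit (mM with mM, % with %).
    """
    volumes = {}
    for name, (final_conc, stock_conc) in recipe.items():
        if not 0 < final_conc <= stock_conc:
            raise ValueError(f"{name}: stock must be at least as concentrated as the buffer")
        volumes[name] = total_ml * final_conc / stock_conc
    # Whatever volume is not taken up by stocks is made up with water.
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


# Final concentrations are from the text; stock concentrations are hypothetical.
LYSIS_BUFFER = {
    "NaPO4": (20, 500),    # mM
    "NaCl": (150, 5000),   # mM
    "MgCl2": (5, 1000),    # mM
    "NP40": (0.1, 10),     # %
    "DTT": (1, 1000),      # mM
}

volumes = stock_volumes(LYSIS_BUFFER, total_ml=10)
for name, ml in volumes.items():
    print(f"{name}: {ml:.2f} ml")
```

Protease and phosphatase inhibitors would be added in the same way from their own stocks, typically just before use.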

Affinity purification of E-box-bound complexes was done as previously 
described (69). Briefly, 100 ng of biotinylated double-stranded DNA containing the 
E-box was immobilized on streptavidin-conjugated magnetic beads and incubated with 
500 µg of cell extract in the presence of poly dI-dC. After extensive washing, bound 
proteins were eluted with SDS sample buffer and analyzed by immunoblotting. 

In vitro protein-protein interaction assays 
The CBP-B fragment and its deletion derivatives were expressed as 
GST fusions as described previously (43). MyoD and E1A (43) were expressed as 
FLAG-fusion proteins in Sf9 cells via a baculovirus expression system and 
affinity-purified on M2 anti-FLAG antibody-agarose (Kodak-IBI). Crude E. coli 
extracts containing GST fusions were incubated with various amounts of MyoD and/or 
E1A in 50 µl of buffer B (20 mM Tris-HCl, pH 8.0, 0.1 M KCl, 5 mM MgCl2, 10% 
glycerol, and 0.1% Nonidet P-40) on ice for 10 min. GST precipitation was performed 
as described (43). MyoD and E1A were detected by immunoblotting with anti-FLAG 
M2 antibody. For the interaction between P/CAF and MyoD, 1.5 pmol of 
FLAG-P/CAF and 15 pmol of FLAG-MyoD were incubated in 50 µl of buffer B on ice 
for 10 min. The mixture was further incubated with 2 µg of anti-P/CAF (43) or 
anti-hADA2 antibody for 60 min. The immunocomplexes were precipitated by 
incubation with 10 µl of protein A-Trisacryl (Pierce) and rotated for 1-4 h at 4°C. The 
matrix was washed 4 times with 200 µl of buffer B and boiled in 10 µl of 2X SDS 
sample buffer. The proteins were resolved by 4%-20% gradient SDS-PAGE and 
subjected to immunoblotting with the anti-FLAG M2 antibody. The blot was developed 
with the SUPERSIGNAL chemiluminescent substrates (Pierce). 

P/CAF coactivates muscle-specific transcription 

P/CAF and MyoD were co-transfected into mouse C3H10T1/2 fibroblasts, and 
MyoD-mediated transcription was determined from reporter activity driven by the 
artificial (4RE) and the naturally occurring muscle creatine kinase (MCK) promoters. 
Overexpression of P/CAF stimulated MyoD-dependent transcription several-fold from 
both promoters. Similar results were obtained for the MyoD-activated myogenin 
promoter. Transcriptional activation was further stimulated by co-transfecting 
MyoD, P/CAF and p300 expression vectors, suggesting that P/CAF may function by 
forming a complex with p300/CBP. Consistent with the lack of DNA binding capacity in 
P/CAF, overexpression of P/CAF alone did not increase the basal transcriptional activity 
of either enhancer. To test whether P/CAF and p300/CBP function in the same pathway, 
two dominant negative forms of p300 that specifically inhibit 
p300/CBP-mediated transcription were employed (60,64). The p300 segment spanning 
residues 1514-1922 inhibits MyoD-dependent activation via direct interaction with MyoD 
(60), whereas the p300 segment spanning residues 1869-2414 inhibits it without direct 
interaction (64). Both dominant negative mutants inhibited MyoD coactivation by 
P/CAF, suggesting that P/CAF and p300/CBP function in the same pathway. 

For further elucidation of the activation mechanism by P/CAF, the effect of E1A, 
which inhibits MyoD-dependent transcription and differentiation (66,72,73) via direct 
interaction with p300/CBP (65,78), was tested. Expression of E1A in C3H10T1/2 
fibroblasts inhibited stimulation of MyoD-directed transcription by P/CAF 
overexpression. E1A mutants lacking p300/CBP-binding activity, E1A Δ2-36 and E1A 
R2G (67,79), had almost no effect. On the other hand, an E1A mutant retaining 
p300/CBP-binding activity, E1A Δ121-130, behaved like the wild type. Since E1A 
associates with p300/CBP, but not with P/CAF, these results suggest that P/CAF 
functions in MyoD-directed transcription via interaction with p300/CBP. 

To address the role of P/CAF as a myogenic coactivator in a more relevant 
environment, P/CAF was overexpressed in proliferating C2C12 myoblasts, which express 
endogenous myogenic bHLH factors. As observed in fibroblasts, overexpression of 
P/CAF stimulated muscle-specific transcription. Concomitant expression of exogenous 
p300 increased P/CAF-mediated coactivation. The repression exerted by wild type E1A, 
but not mutant E1A Δ2-36, on P/CAF coactivation of MyoD was also observed in 
muscle cells. 

Similar experiments were performed with myogenic cell lines that were stably 
transformed with wild type or mutant E1A-expressing vectors (66). Coactivation by 
P/CAF was inhibited by wild type E1A or the E1A mutant that retains 
p300/CBP-binding activity (E1A Δ121-130). In contrast, E1A mutants that lack 
p300/CBP-binding (E1A Δ2-36 and E1A R2G) allowed transcriptional coactivation by 
P/CAF. Taken together, these experiments show that P/CAF coactivates MyoD-directed 
transcription via interaction with p300/CBP. 

P/CAF stimulates myogenic differentiation 

Given that P/CAF potentiates MyoD-directed transcription, the ability of P/CAF 
to assist MyoD in promoting myogenic differentiation was investigated. To this end, 
C3H10T1/2 fibroblasts were transiently transfected with P/CAF and MyoD expression 
vectors. An expression vector for the green fluorescent protein (GFP) was 
co-transfected to identify transfected cells. After incubation in differentiation medium, 
the myogenic conversion of transfected cells was determined by simultaneous expression 
of GFP and the differentiation-specific marker myosin heavy chain (MHC). Forced 
expression of MyoD in fibroblasts caused muscle differentiation in 12% of the 
transfected fibroblasts. Myogenic conversion rose to 20% when MyoD and P/CAF were 
co-expressed. As observed in transcription experiments, stimulation of differentiation by 
P/CAF was counteracted by co-transfection with the p300 dominant negative mutant, 
p300 (1869-2414). Consistent with a general role for coactivators, overexpression of 
P/CAF alone did not induce differentiation of fibroblasts. 
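The scoring of myogenic conversion described above can be written out as a small Python sketch. It assumes conversion is the percentage of GFP-positive (transfected) cells that also stain for MHC. The cell counts below are hypothetical and were chosen only so that they reproduce the reported 12% and 20% values; the function names are illustrative.

```python
def percent_conversion(gfp_positive, gfp_and_mhc_positive):
    """Percentage of transfected (GFP+) cells that also express MHC."""
    if gfp_positive <= 0:
        raise ValueError("at least one transfected cell is required")
    if not 0 <= gfp_and_mhc_positive <= gfp_positive:
        raise ValueError("MHC+ cells must be a subset of the GFP+ cells")
    return 100.0 * gfp_and_mhc_positive / gfp_positive


def fold_stimulation(baseline_percent, stimulated_percent):
    """Ratio of conversion with the coactivator to conversion without it."""
    if baseline_percent <= 0:
        raise ValueError("baseline conversion must be positive")
    return stimulated_percent / baseline_percent


# Hypothetical counts that match the percentages given in the text.
myod_alone = percent_conversion(gfp_positive=250, gfp_and_mhc_positive=30)
myod_pcaf = percent_conversion(gfp_positive=250, gfp_and_mhc_positive=50)
fold = fold_stimulation(myod_alone, myod_pcaf)

print(f"MyoD alone: {myod_alone:.0f}%")
print(f"MyoD + P/CAF: {myod_pcaf:.0f}%")
print(f"fold stimulation: {fold:.2f}")
```

On these figures, co-expression of P/CAF corresponds to roughly a 1.7-fold increase in myogenic conversion.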

Similar experiments were done using proliferating C2C12 myoblasts, which are 
already committed to the differentiation program. Most of the myoblasts differentiated into 
myotubes when P/CAF was overexpressed, whereas only a modest effect was observed when 
p300 was overexpressed. In contrast, differentiation was inhibited slightly by overexpressing 
c-Jun. This inhibitory effect presumably was caused by titration of p300/CBP, which 
associates directly with c-Jun (74). A similar inhibition was observed with the p300 
dominant negative mutant. Consistent with the transcriptional effect, E1A almost 
completely inhibited differentiation. The E1A mutant R2G, lacking p300/CBP-binding 
capability but retaining the retinoblastoma protein (Rb)-binding capability, only partially 
inhibited differentiation, although this same mutant 
inhibited transcription as severely as the wild type. Taken together, these data show that 
P/CAF stimulates muscle differentiation by coactivating MyoD function via p300/CBP 
association. 

P/CAF is essential for myogenic transcription and differentiation 

To test the necessity of P/CAF for myogenic transcription, experiments were 
conducted in which P/CAF synthesis was inhibited by expressing antisense P/CAF RNA. 
A vector from which the P/CAF mRNA is transcribed in the antisense orientation 
(P/CAF-AS) was transfected with P/CAF and MyoD expression vectors into fibroblasts, 
and MyoD-dependent transcription was examined. Cotransfection of the antisense 
expression vector strongly inhibited MyoD-dependent transcription, to below the level of 
induction elicited by MyoD alone, demonstrating that expression of P/CAF antisense 
RNA inhibits not only the coactivation exerted by exogenous P/CAF but also that of 
endogenous P/CAF. These results indicate that P/CAF is essential for MyoD-dependent 
transcription. 

Studies were also carried out to determine whether expression of P/CAF 
antisense RNA inhibits myogenic differentiation. C3H10T1/2 fibroblasts were transiently 
transfected with various expression vectors with or without the P/CAF antisense RNA 
expression vector. Expression of P/CAF antisense RNA reduced MyoD-mediated 
myogenic conversion of fibroblasts. Expression of P/CAF antisense RNA also 
counteracted the stimulatory effect of both P/CAF and p300 on myogenic 
differentiation. These data support the view that P/CAF and p300/CBP coactivate 
MyoD-dependent transcription in the same pathway. More drastic inhibition was 
observed in C2C12 myoblasts in similar experiments. Therefore, it can be concluded that 
P/CAF is essential for transcription of muscle-specific genes and hence for differentiation 
into myotubes. 

To further confirm the essential role of P/CAF in myogenic differentiation, 
blocking experiments by antibody microinjection were performed. Antibodies were 
injected into the cytoplasm of proliferating C2C12 myoblasts to prevent the nuclear 
transport of newly synthesized target proteins. After incubation in differentiation 
medium, the degree of differentiation was determined. Microinjection of an anti-P/CAF 
antibody almost completely inhibited differentiation. Similar results were obtained by 
microinjecting anti-p300/CBP antibodies. Although microinjection of either 
anti-p300/CBP or anti-P/CAF antibody was sufficient to inhibit differentiation, an even 
greater inhibition was observed when both were coinjected. Microinjection of 
anti-P/CAF or anti-p300/CBP antibody did not interfere with induction of p53 by DNA 
damaging agents, showing the specificity of the inhibition by the antibodies. In contrast to 
anti-P/CAF or anti-p300/CBP antibodies, injection of anti-MyoD antibody only 
partially inhibited differentiation, supporting the view of functional redundancy between 
MyoD and Myf-5 (75,76). Injection of anti-c-Jun antibody or control antibody did not 
interfere with muscle differentiation. 



Similar experiments were performed with C3H10T1/2 fibroblasts stably 
expressing MyoD. In these cells, either anti-p300/CBP or anti-P/CAF antibody 
completely inhibited muscle differentiation. In contrast to myoblasts, anti-MyoD 
antibody completely blocked differentiation in the fibroblasts expressing MyoD. 
Anti-c-Jun and control antibodies did not interfere with differentiation. Taken together, 
these results demonstrate that P/CAF and p300/CBP are indispensable for activation of 
the myogenic program. 

p300/CBP, P/CAF and MyoD form a multimeric complex in vivo 

The data described above indicate that P/CAF stimulates MyoD-directed 
transcription via association with p300/CBP. Thus, experiments were conducted to 
investigate whether P/CAF, p300/CBP and MyoD could associate in a complex. 
First, cellular extracts derived from C2C12 myotubes were subjected to 
immunoprecipitation. Both anti-MyoD and anti-p300/CBP antibodies co-precipitated 
P/CAF. In a complementary experiment, both anti-p300/CBP and anti-P/CAF 
antibodies also co-precipitated MyoD, suggesting that these factors form a multimeric 
protein complex in myotubes. 

Next, attempts were made to detect this complex on the E-box, the DNA 
binding site for MyoD. Immobilized DNA containing an E-box sequence was incubated 
with myotube extracts. After extensive washing, P/CAF, p300/CBP and MyoD were 
analyzed by immunoblotting. P/CAF, p300/CBP and MyoD were all affinity purified on 
the immobilized DNA, whereas they were not purified on the control DNA lacking the 
E-box. Given that P/CAF and p300/CBP per se cannot bind to DNA, these observations 
indicate that P/CAF and p300/CBP are recruited through MyoD at the E-box sites to 
form a multi-protein complex. 

Complex formation is inhibited by viral transforming factors 

Since the oncoviral proteins E1A and large T antigen inhibit myogenic 
transcription and differentiation, the effect of these factors on the formation of 
complexes on the E-box was tested. Importantly, very small amounts of P/CAF and 
p300/CBP were co-purified on the E-box from extracts of myocytes that stably express 
E1A or large T antigen, although MyoD was detected under these conditions. The lower 
recovery of MyoD from E1A-expressing muscle cells could reflect the low level of 
MyoD in the extracts (66). These results indicate that E1A and large T antigen 
dissociate P/CAF and p300/CBP from MyoD without altering MyoD binding to DNA. 

Consistent with the previous observations that transiently expressed E1A 
prevents interaction between P/CAF and p300/CBP in vivo (43), the association 
between p300/CBP and P/CAF was abolished in myoblasts stably transformed by wild 
type E1A but not in clones transformed with the E1A mutant R2G, which is unable to bind 
p300/CBP. Similarly, the interaction between p300/CBP and P/CAF was abolished by 
large T antigen but not by the mutant protein that localizes to the cytoplasm (77). 

Interaction between MyoD, P/CAF and CBP in vitro 

Previous interaction experiments in vitro indicate that the CBP region spanning 
residues 1801 to 1850 is crucial for interaction with both P/CAF and E1A (43). While 
most sequence-specific factors bind to CBP sites distinct from the P/CAF/E1A binding 
sites, MyoD interacts with an overlapping CBP fragment called the CH3 region 
(60,64,65). To understand how P/CAF, p300/CBP and MyoD associate, the CBP sites 
important for MyoD binding were mapped more precisely. Consistent with previous 
reports (60,64,65), the CBP fragment spanning residues 1801-2000 (fragment B) bound 
MyoD. Moreover, deletion of residues 1801 to 1850 within fragment B completely 
abolished interaction with MyoD, which is similar to the results obtained with P/CAF 
and E1A. Importantly, an internal deletion of residues 1850-1878 abolished the MyoD 
interaction with CBP, while it did not affect binding of E1A or P/CAF (43). These 
results suggest that MyoD and P/CAF bind to distinct sites of p300/CBP, although the 
binding sites may overlap. Moreover, a direct interaction was observed between MyoD 
and P/CAF, which may contribute to stabilization of the multimeric complex. 
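The reasoning behind the deletion mapping above can be sketched in Python. The sketch records, for each internal deletion within fragment B, whether each partner still binds, and then lists the regions whose removal abolishes binding. The binding outcomes are those stated in the text; the data layout and the function name `required_regions` are illustrative assumptions.

```python
# Binding to CBP fragment B (residues 1801-2000) after an internal deletion,
# as reported in the text. True means binding was retained.
DELETION_RESULTS = {
    (1801, 1850): {"MyoD": False, "P/CAF": False, "E1A": False},
    (1850, 1878): {"MyoD": False, "P/CAF": True, "E1A": True},
}


def required_regions(results):
    """Map each partner to the deleted regions that abolished its binding."""
    required = {}
    for region, binding in results.items():
        for partner, retained in binding.items():
            required.setdefault(partner, [])
            if not retained:
                required[partner].append(region)
    return required


required = required_regions(DELETION_RESULTS)

# Regions needed by all three partners, and regions needed by MyoD alone.
shared = set(required["MyoD"]) & set(required["P/CAF"]) & set(required["E1A"])
myod_only = set(required["MyoD"]) - set(required["P/CAF"]) - set(required["E1A"])

print("required by all partners:", sorted(shared))
print("required by MyoD only:", sorted(myod_only))
```

Residues 1801-1850 come out as required by all three partners, while residues 1850-1878 are required only by MyoD, which is the basis for describing the sites as distinct but possibly overlapping.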

These data show that E1A prevents not only p300/CBP interaction with 
P/CAF but also that with MyoD in vivo. To obtain evidence that this 
inhibition is due to direct action of E1A, competition experiments were performed 
in vitro. Importantly, the interaction between CBP and MyoD was strongly inhibited by 
addition of E1A, indicating that E1A inhibits myogenic transcription by disrupting 
multiple interactions. 



Although the present process has been described with reference to specific 
details of certain embodiments thereof, it is not intended that such details should be 
regarded as limitations upon the scope of the invention except as and to the extent that 
they are included in the accompanying claims. 

WO 98/03652 PCT/US97/12877 



Throughout this application various publications are referenced by numbers 
within parentheses. Full citations for these publications are as follows. The disclosures 
of these publications in their entireties are hereby incorporated by reference into this 
application in order to more fully describe the state of the art to which this invention 
pertains. 

SEQUENCE LISTING 

(1) GENERAL INFORMATION 

(i) APPLICANT: The United States of America, as represented by the 
Secretary, Department of Health and Human Services, c/o 
National Institutes of Health, Office of Technology Transfer, 
6011 Executive Boulevard, Suite 325, Rockville, Maryland 20842 

(ii) TITLE OF THE INVENTION: METHODS AND COMPOSITIONS FOR 
p300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF 

(iii) NUMBER OF SEQUENCES: 18 

(iv) CORRESPONDENCE ADDRESS: 
(A) ADDRESSEE: NEEDLE & ROSENBERG, P.C. 
(B) STREET: Suite 1200, 127 Peachtree Street, NE 
(C) CITY: Atlanta 
(D) STATE: GA 
(E) COUNTRY: USA 
(F) ZIP: 30303 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Diskette 
(B) COMPUTER: IBM Compatible 
(C) OPERATING SYSTEM: DOS 
(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: 
(B) FILING DATE: 23-JUL-1997 
(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 
(A) APPLICATION NUMBER: Corresponding U.S. Serial No. 60/022,27, 
(B) FILING DATE: 23-July-1996 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: Miller, Mary L. 
(B) REGISTRATION NUMBER: 39,303 
(C) REFERENCE/DOCKET NUMBER: 14014.0238/P 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: 404/688-0770 
(B) TELEFAX: 404/688-9880 
(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 832 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala 
1 5 10 15 

Gly Ala Gly Ala Gly Pro Gly Ala Leu Pro Pro Gln Pro Ala Ala Leu 
20 25 30 

Pro Pro Ala Pro Pro Gln Gly Ser Pro Cys Ala Ala Ala Ala Gly Gly 
35 40 45 

Ser Gly Ala Cys Gly Pro Ala Thr Ala Val Ala Ala Ala Gly Thr Ala 
50 55 60 

Glu Gly Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala 
65 70 75 80 

Gln Leu Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val 
85 90 95 

Tyr Ser Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys 
100 105 110 

Asn Pro Asn Pro Ser Pro Thr Pro Pro Arg Ala Asp Leu Gln Gln Ile 
115 120 125 

Ile Val Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala 
130 135 140 

Ala His Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asn Arg 
145 150 155 160 

Leu Leu Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His 
165 170 175 

Lys Glu Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys 
180 185 190 

Leu Leu Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly 
195 200 205 

Ser Leu Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly 
210 215 220 

Val Asn Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ala Lys Glu 
225 230 235 240 

Arg Gln Thr Ile Val Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn 
245 250 255 

Tyr Trp His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn 
260 265 270 

Asp Asp Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr 
275 280 285 

Cys Asn Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr 
290 295 300 

Gln Val Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Val Met Arg 
305 310 315 320 

[?] Gln Leu Leu Glu Gln Ala Arg Gln Glu Lys Asp Lys Leu Pro Leu 
325 330 335 

Glu Lys Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser Met 
340 345 350 

Leu Glu Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln Asp 
355 360 365 

Phe Leu Ser Ala Ser Ser Arg Thr Ser Gln Leu Gly Ile Gln Thr Val 
370 375 380 

Ile Asn Pro Pro Pro Val Ala Gly Thr Ile Ser Tyr Asn Ser Thr Ser 
385 390 395 400 

Ser Ser Leu Glu Gln Pro Asn Ala Gly Ser Ser Ser Pro Ala Cys Lys 
405 410 415 

Ala Ser Ser Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met Thr 
420 425 430 

Asp Ser His Val Leu Glu Glu Ala Lys Lys Pro Arg Val Met Gly Asp 
435 440 445 

Ile Pro Met Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp Pro 
450 455 460 

Ala Ala Met Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser Ala 
465 470 475 480 

Arg Asp Glu Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu Phe 
485 490 495 

His Val Val Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile Leu 
500 505 510 

Met Trp Leu Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg 
515 520 525 

Met Pro Lys Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His Lys 
530 535 540 

Thr Leu Ala Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe 
545 550 555 560 

Arg Met Phe Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val 
565 570 575 

Thr Ser Asn Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His 
580 585 590 

Leu Lys Glu Tyr His Ile Lys His Asp Ile Leu Asn Phe Leu Thr Tyr 
595 600 605 

Ala Asp Glu Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys 
610 615 620 

Glu Ile Lys Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp Tyr 
625 630 635 640 

Glu Gly Ala Thr Leu Met Gly Cys Glu Leu Asn Pro Arg Ile Pro Tyr 
645 650 655 

Thr Glu Phe Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys 
660 665 670 

Leu Ile Glu Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu 
675 680 685 

Ser Cys Phe Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile Pro 
690 695 700 

Gly Ile Arg Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser Lys 
705 710 715 720 

Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser Ile Leu 
725 730 735 

Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu Pro Val 
740 745 750 

Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Ser Pro Met 
755 760 765 

Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr Val Ser 
770 775 780 

Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys Lys 
785 790 795 800 

Glu Tyr Asn Ala Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Asn Ile Leu 
805 810 815 

Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys 
820 825 830 
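For readers who want to work with the listed sequences in one-letter form, the following Python sketch converts a run of three-letter residue codes into a one-letter string. The mapping is the standard amino acid code table; the function name `to_one_letter` is illustrative, and the input is assumed to contain residue codes only, with the position numbers removed.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_sequence):
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    letters = []
    for residue in three_letter_sequence.split():
        if residue not in THREE_TO_ONE:
            # Fail loudly on OCR damage such as "Gin" or "He" instead of guessing.
            raise ValueError(f"unrecognized residue code: {residue!r}")
        letters.append(THREE_TO_ONE[residue])
    return "".join(letters)


# Residues 1-16 of SEQ ID NO: 1 as listed above.
first_row = "Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala"
print(to_one_letter(first_row))
```

Rejecting unknown codes, rather than skipping them, makes the conversion double as a check for residual extraction errors in the listing.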

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 481 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Leu Glu Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln 
1 5 10 15 

Asp Phe Leu Ser Ala Ser Ser Arg Thr Ser Gln Leu Gly Ile Gln Thr 
20 25 30 

Val Ile Asn Pro Pro Pro Val Ala Gly Thr Ile Ser Tyr Asn Ser Thr 
35 40 45 

Ser Ser Ser Leu Glu Gln Pro Asn Ala Gly Ser Ser Ser Pro Ala Cys 
50 55 60 

Lys Ala Ser Ser Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met 
65 70 75 80 

Thr Asp Ser His Val Leu Glu Glu Ala Lys Lys Pro Arg Val Met Gly 
85 90 95 

Asp Ile Pro Met Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp 
100 105 110 

Pro Ala Ala Met Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser 
115 120 125 

Ala Arg Asp Glu Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu 
130 135 140 

Phe His Val Val Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile 
145 150 155 160 

Leu Met Trp Leu Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro 
165 170 175 

Arg Met Pro Lys Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His 
180 185 190 

Lys Thr Leu Ala Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys 
195 200 205 

Phe Arg Met Phe Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala 
210 215 220 

Val Thr Ser Asn Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn 
225 230 235 240 

His Leu Lys Glu Tyr His Ile Lys His Asp Ile Leu Asn Phe Leu Thr 
245 250 255 

Tyr Ala Asp Glu Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser 
260 265 270 

Lys Glu Ile Lys Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp 
275 280 285 

Tyr Glu Gly Ala Thr Leu Met Gly Cys Glu Leu Asn Pro Arg Ile Pro 
290 295 300 

Tyr Thr Glu Phe Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys 
305 310 315 320 

Lys Leu Ile Glu Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly 
325 330 335 

Leu Ser Cys Phe Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile 
340 345 350 

Pro Gly Ile Arg Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser 
355 360 365 

Lys Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser Ile 
370 375 380 

Leu Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu Pro 
385 390 395 400 

Val Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Ser Pro 
405 410 415 

Met Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr Val 
420 425 430 

Ser Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys 
435 440 445 

Lys Glu Tyr Asn Ala Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Asn Ile 
450 455 460 

Leu Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp 
465 470 475 480 

Lys 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 203 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



Arg 

1 


Val 


Val 


Gin 


His 


Thr 


Lys 


Gly 


Cys 


Lys 
10 


Arg 


Lys 


Thr 


Asn 


Gly 
15 


Gly 


Cys 


Pro 


He 




Lys 


Gin 


Leu 


He 


Ala 


Leu 


Cys 


Cys 


Tyr 


His 


Ala 


Lys 
















25 










30 






His 


Cys 


Gin 


bill 


Asn 


Lys 




P CO 


Val 


Pro 


Phe 


Cvs 


Leu 


Asn 


He 


Lys 




35 




















45 






Gin 


Gin 


Lys 


Leu 


Arg 


Gin 


Gin 


Gin 


Leu 




Hie: 


Arg 


Leu 


Gin 


Gin' 


Ala 




50 








DO 










60 








Gin 


Met 


Leu 


Arg 


Arg 


Arg 


Mat- 


ru. cz 




Met 




Thr 


Glv 


Val 


Val 


Gly 


65 






f U 










7 5 










80 


Gin 


bin 


ui y 


i*eu 


Pro 


Ser 


Pro 


Thr 


Pro 


Ala 


Thr 


Pro 


Thr 


Thr 


Pro 


Thr 






85 










90 










95 




Gly 


Gin 


Gin 


Pro 


Thr 


Thr 


Pro 


Gin 


Thr 


Pro 


Gin 


Pro 


Thr 


Ser 


Gin 


Pro 






100 










105 










110 


Thr 


Gin 


Gin 


Pro 


Thr 
115 


Pro 


Pro 


Asn 


Ser 


Met 
120 


Pro 


Pro 


Tyr 


Leu 


Pro 

125 


Arg 


Ala 


Ala 


Gly 


Pro 


Val 


Ser 


Gin 


Gly 


Lys 


Ala 


Ala 


Gly 


Gin 


Val 


Thr 


Pro 




130 








135 










140 


Gly 




Pro 


Pro 


Pro 


Thr 


Pro 


Pro 


Gin 


Thr


Ala 


Gin 


Pro 


Pro 


Leu 


Pro 


Pro 


145 










150 










155 








Thr


160 


Thr 


Ala 


Val 


Glu 


Met 


Ala 


Met 


Gin 


He 


Gin 


Arg 


Ala 


Ala 


Glu 


Gin 






165 










170 










175 


Gin 


Arg 


Gin 


Met 


Ala 


His 


Val 


Gin 


He 


Phe 


Gin 


Arg 


Pro 


He 


Gin 


His 






180 










185 










190 






Met 


Pro 


Pro 


Met 


Thr 


Pro 


Met 


Ala 


Pro 


Met 


Gly 













195 200 



(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 351 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 

Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala
1 5 10 15

Gly Ala Gly Ala Gly Pro Gly Ala Leu Pro Pro Gln Pro Ala Ala Leu
20 25 30

Pro Pro Ala Pro Pro Gln Gly Ser Pro Cys Ala Ala Ala Ala Gly Gly
35 40 45

Ser Gly Ala Cys Gly Pro Ala Thr Ala Val Ala Ala Ala Gly Thr Ala
50 55 60

Glu Gly Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala
65 70 75 80

Gln Leu Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val
85 90 95

Tyr Ser Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys
100 105 110

Asn Pro Asn Pro Ser Pro Thr Pro Pro Arg Ala Asp Leu Gln Gln Ile
115 120 125

Ile Val Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala
130 135 140

Ala His Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asn Arg
145 150 155 160

Leu Leu Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His
165 170 175

Lys Glu Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys
180 185 190

Leu Leu Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly
195 200 205

Ser Leu Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly
210 215 220

Val Asn Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ala Lys Glu
225 230 235 240

Arg Gln Thr Ile Val Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn
245 250 255

Tyr Trp His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn
260 265 270

Asp Asp Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr
275 280 285

Cys Asn Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr
290 295 300

Gln Val Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Val Met Arg
305 310 315 320

Arg Gln Leu Leu Glu Gln Ala Arg Gln Glu Lys Asp Lys Leu Pro Leu
325 330 335

Glu Lys Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser
340 345 350





(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 476 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 



Met Leu Glu Glu Glu Ile Tyr Gly Ala Asn Ser Pro Ile Trp Glu Ser
1 5 10 15

Gly Phe Thr Met Pro Pro Ser Glu Gly Thr Gln Leu Val Pro Arg Pro
20 25 30

Ala Ser Val Ser Ala Ala Val Val Pro Ser Thr Pro Ile Phe Ser Pro
35 40 45

Ser Met Gly Gly Gly Ser Asn Ser Ser Leu Ser Leu Asp Ser Ala Gly
50 55 60

Ala Glu Pro Met Pro Gly Glu Lys Arg Thr Leu Pro Glu Asn Leu Thr
65 70 75 80

Leu Glu Asp Ala Lys Arg Leu Arg Val Met Gly Asp Ile Pro Met Glu
85 90 95

Leu Val Asn Glu Val Met Leu Thr Ile Thr Asp Pro Ala Ala Met Leu
100 105 110

Gly Pro Glu Thr Ser Leu Leu Ser Ala Asn Ala Ala Arg Asp Glu Thr
115 120 125

Ala Arg Leu Glu Glu Arg Arg Gly Ile Ile Glu Phe His Val Ile Gly
130 135 140

Asn Ser Leu Thr Pro Lys Ala Asn Arg Arg Val Leu Leu Trp Leu Val
145 150 155 160

Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg Met Pro Lys Glu
165 170 175

Tyr Ile Ala Arg Leu Val Phe Asp Pro Lys His Lys Thr Leu Ala Leu
180 185 190

Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe Arg Met Phe Pro
195 200 205

Thr Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val Thr Ser Asn Glu
210 215 220

Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His Leu Lys Glu Tyr
225 230 235 240

His Ile Lys His Asn Ile Leu Tyr Phe Leu Thr Tyr Ala Asp Glu Tyr
245 250 255

Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys Asp Ile Lys Val
260 265 270

Pro Lys Ser Arg Tyr Leu Gly Tyr Ile Lys Asp Tyr Glu Gly Ala Thr
275 280 285

Leu Met Glu Cys Glu Leu Asn Pro Arg Ile Pro Tyr Thr Glu Leu Ser
290 295 300

His Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys Leu Ile Glu Arg
305 310 315 320

Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu Ser Cys Phe Lys
325 330 335

Glu Gly Val Arg Gln Ile Pro Val Glu Ser Val Pro Gly Ile Arg Glu
340 345 350

Thr Gly Trp Lys Pro Leu Gly Lys Glu Lys Gly Lys Glu Leu Lys Asp
355 360 365

Pro Asp Gln Leu Tyr Thr Thr Leu Lys Asn Leu Leu Ala Gln Ile Lys
370 375 380

Ser His Pro Ser Ala Trp Pro Phe Met Glu Pro Val Lys Lys Ser Glu
385 390 395 400

Ala Pro Asp Tyr Tyr Glu Val Ile Arg Phe Pro Ile Asp Leu Lys Thr
405 410 415

Met Thr Glu Arg Leu Arg Ser Arg Tyr Tyr Val Thr Arg Lys Leu Phe
420 425 430

Val Ala Asp Leu Gln Arg Val Ile Ala Asn Cys Arg Glu Tyr Asn Pro
435 440 445

Pro Asp Ser Glu Tyr Cys Arg Cys Ala Ser Ala Leu Glu Lys Phe Phe
450 455 460

Tyr Phe Lys Leu Lys Glu Gly Gly Leu Ile Asp Lys
465 470 475

(2) INFORMATION FOR SEQ ID NO: 6: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2414 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:



Met Ala Glu Asn Val Val Glu Pro Gly Pro Pro Ser Ala Lys Arg Pro
1 5 10 15

Lys Leu Ser Ser Pro Ala Leu Ser Ala Ser Ala Ser Asp Gly Thr Asp
20 25 30

Ile Phe Gly Ser Leu Phe Asp Leu Glu His Asp Leu Pro Asp Glu Leu
35 40 45

Asn Ser Thr Glu Leu Gly Leu Thr Asn Gly Gly Asp Ile Asn Gln Leu
50 55 60

Gln Gln Thr Ser Leu Gly Met Val Gln Asp Ala Ala Ser Lys His Lys
65 70 75 80

Leu Ser Glu Leu Leu Arg Ser Gly Ser Ser Pro Asn Leu Asn Met Gly
85 90 95

Val Gly Gly Pro Gly Gln Val Met Ala Ser Gln Ala Gln Gln Ser Ser
100 105 110







Pro Gly Leu Gly Leu Ile Asn Ser Met Val Lys Ser Pro Met Thr Gln
115 120 125

Ala Gly Leu Thr Ser Pro Asn Met Gly Met Gly Thr Ser Gly Pro Asn
130 135 140

Gln Gly Pro Thr Gln Ser Thr Gly Met Met Asn Ser Pro Val Asn Gln
145 150 155 160

Pro Ala Met Gly Met Asn Thr Gly Thr Asn Ala Gly Met Asn Pro Gly
165 170 175

Met Leu Ala Ala Gly Asn Gly Gln Gly Ile Met Pro Asn Gln Val Met
180 185 190

Asn Gly Ser Ile Gly Ala Gly Arg Gly Arg Gln Asp Met Gln Tyr Pro
195 200 205

Asn Pro Gly Met Gly Ser Ala Gly Asn Leu Leu Thr Glu Pro Leu Gln
210 215 220

Gln Gly Ser Pro Gln Met Gly Gly Gln Thr Gly Leu Arg Gly Pro Gln
225 230 235 240

Pro Leu Lys Met Gly Met Met Asn Asn Pro Asn Pro Tyr Gly Ser Pro
245 250 255

Tyr Thr Gln Asn Pro Gly Gln Gln Ile Gly Ala Ser Gly Leu Gly Leu
260 265 270

Gln Ile Gln Thr Lys Thr Val Leu Ser Asn Asn Leu Ser Pro Phe Ala
275 280 285

Met Asp Lys Lys Ala Val Pro Gly Gly Gly Met Pro Asn Met Gly Gln
290 295 300

Gln Pro Ala Pro Gln Val Gln Gln Pro Gly Leu Val Thr Pro Val Ala
305 310 315 320

Gln Gly Met Gly Ser Gly Ala His Thr Ala Asp Pro Glu Lys Arg Lys
325 330 335

Leu Ile Gln Gln Gln Leu Val Leu Leu Leu His Ala His Lys Cys Gln
340 345 350

Arg Arg Glu Gln Ala Asn Gly Glu Val Arg Gln Cys Asn Leu Pro His
355 360 365

Cys Arg Thr Met Lys Asn Val Leu Asn His Met Thr His Cys Gln Ser
370 375 380

Gly Lys Ser Cys Gln Val Ala His Cys Ala Ser Ser Arg Gln Ile Ile
385 390 395 400

Ser His Trp Lys Asn Cys Thr Arg His Asp Cys Pro Val Cys Leu Pro
405 410 415

Leu Lys Asn Ala Gly Asp Lys Arg Asn Gln Gln Pro Ile Leu Thr Gly
420 425 430

Ala Pro Val Gly Leu Gly Asn Pro Ser Ser Leu Gly Val Gly Gln Gln
435 440 445

Ser Ala Pro Asn Leu Ser Thr Val Ser Gln Ile Asp Pro Ser Ser Ile
450 455 460

Glu Arg Ala Tyr Ala Ala Leu Gly Leu Pro Tyr Gln Val Asn Gln Met
465 470 475 480

Pro Thr Gln Pro Gln Val Gln Ala Lys Asn Gln Gln Asn Gln Gln Pro
485 490 495

Gly Gln Ser Pro Gln Gly Met Arg Pro Met Ser Asn Met Ser Ala Ser
500 505 510

Pro Met Gly Val Asn Gly Gly Val Gly Val Gln Thr Pro Ser Leu Leu
515 520 525

Ser Asp Ser Met Leu His Ser Ala Ile Asn Ser Gln Asn Pro Met Met
530 535 540

Ser Glu Asn Ala Ser Val Pro Ser Leu Gly Pro Met Pro Thr Ala Ala
545 550 555 560

Gln Pro Ser Thr Thr Gly Ile Arg Lys Gln Trp His Glu Asp Ile Thr
565 570 575

Gln Asp Leu Arg Asn His Leu Val His Lys Leu Val Gln Ala Ile Phe
580 585 590

Pro Thr Pro Asp Pro Ala Ala Leu Lys Asp Arg Arg Met Glu Asn Leu
595 600 605

Val Ala Tyr Ala Arg Lys Val Glu Gly Asp Met Tyr Glu Ser Ala Asn
610 615 620

Asn Arg Ala Glu Tyr Tyr His Leu Leu Ala Glu Lys Ile Tyr Lys Ile
625 630 635 640

Gln Lys Glu Leu Glu Glu Lys Arg Arg Thr Arg Leu Gln Lys Gln Asn
645 650 655

Met Leu Pro Asn Ala Ala Gly Met Val Pro Val Ser Met Asn Pro Gly
660 665 670

Pro Asn Met Gly Gln Pro Gln Pro Gly Met Thr Ser Asn Gly Pro Leu
675 680 685

Pro Asp Pro Ser Met Ile Arg Gly Ser Val Pro Asn Gln Met Met Pro
690 695 700

Arg Ile Thr Pro Gln Ser Gly Leu Asn Gln Phe Gly Gln Met Ser Met
705 710 715 720

Ala Gln Pro Pro Ile Val Pro Arg Gln Thr Pro Pro Leu Gln His His
725 730 735

Gly Gln Leu Ala Gln Pro Gly Ala Leu Asn Pro Pro Met Gly Tyr Gly
740 745 750

Pro Arg Met Gln Gln Pro Ser Asn Gln Gly Gln Phe Leu Pro Gln Thr
755 760 765

Gln Phe Pro Ser Gln Gly Met Asn Val Thr Asn Ile Pro Leu Ala Pro
770 775 780

Ser Ser Gly Gln Ala Pro Val Ser Gln Ala Gln Met Ser Ser Ser Ser
785 790 795 800

Cys Pro Val Asn Ser Pro Ile Met Pro Pro Gly Ser Gln Gly Ser His
805 810 815

Ile His Cys Pro Gln Leu Pro Gln Pro Ala Leu His Gln Asn Ser Pro
820 825 830

Ser Pro Val Pro Ser Arg Thr Pro Thr Pro His His Thr Pro Pro Ser
835 840 845

Ile Gly Ala Gln Gln Pro Pro Ala Thr Thr Ile Pro Ala Pro Val Pro
850 855 860

Thr Pro Pro Ala Met Pro Pro Gly Pro Gln Ser Gln Ala Leu His Pro
865 870 875 880

Pro Pro Arg Gln Thr Pro Thr Pro Pro Thr Thr Gln Leu Pro Gln Gln
885 890 895

Val Gln Pro Ser Leu Pro Ala Ala Pro Ser Ala Asp Gln Pro Gln Gln
900 905 910

Gln Pro Arg Ser Gln Gln Ser Thr Ala Ala Ser Val Pro Thr Pro Asn
915 920 925

Ala Pro Leu Leu Pro Pro Gln Pro Ala Thr Pro Leu Ser Gln Pro Ala
930 935 940

Val Ser Ile Glu Gly Gln Val Ser Asn Pro Pro Ser Thr Ser Ser Thr
945 950 955 960

Glu Val Asn Ser Gln Ala Ile Ala Glu Lys Gln Pro Ser Gln Glu Val
965 970 975

Lys Met Glu Ala Lys Met Glu Val Asp Gln Pro Glu Pro Ala Asp Thr
980 985 990

Gln Pro Glu Asp Ile Ser Glu Ser Lys Val Glu Asp Cys Lys Met Glu
995 1000 1005

Ser Thr Glu Thr Glu Glu Arg Ser Thr Glu Leu Lys Thr Glu Ile Lys
1010 1015 1020

Glu Glu Glu Asp Gln Pro Ser Thr Ser Ala Thr Gln Ser Ser Pro Ala
1025 1030 1035 1040

Pro Gly Gln Ser Lys Lys Lys Ile Phe Lys Pro Glu Glu Leu Arg Gln
1045 1050 1055

Ala Leu Met Pro Thr Leu Glu Ala Leu Tyr Arg Gln Asp Pro Glu Ser
1060 1065 1070

Leu Pro Phe Arg Gln Pro Val Asp Pro Gln Leu Leu Gly Ile Pro Asp
1075 1080 1085

Tyr Phe Asp Ile Val Lys Ser Pro Met Asp Leu Ser Thr Ile Lys Arg
1090 1095 1100

Lys Leu Asp Thr Gly Gln Tyr Gln Glu Pro Trp Gln Tyr Val Asp Asp
1105 1110 1115 1120

Ile Trp Leu Met Phe Asn Asn Ala Trp Leu Tyr Asn Arg Lys Thr Ser
1125 1130 1135

Arg Val Tyr Lys Tyr Cys Ser Lys Leu Ser Glu Val Phe Glu Gln Glu
1140 1145 1150

Ile Asp Pro Val Met Gln Ser Leu Gly Tyr Cys Cys Gly Arg Lys Leu
1155 1160 1165

Glu Phe Ser Pro Gln Thr Leu Cys Cys Tyr Gly Lys Gln Leu Cys Thr
1170 1175 1180

Ile Pro Arg Asp Ala Thr Tyr Tyr Ser Tyr Gln Asn Arg Tyr His Phe
1185 1190 1195 1200

Cys Glu Lys Cys Phe Asn Glu Ile Gln Gly Glu Ser Val Ser Leu Gly
1205 1210 1215

Asp Asp Pro Ser Gln Pro Gln Thr Thr Ile Asn Lys Glu Gln Phe Ser
1220 1225 1230

Lys Arg Lys Asn Asp Thr Leu Asp Pro Glu Leu Phe Val Glu Cys Thr
1235 1240 1245

Glu Cys Gly Arg Lys Met His Gln Ile Cys Val Leu His His Glu Ile
1250 1255 1260

Ile Trp Pro Ala Gly Phe Val Cys Asp Gly Cys Leu Lys Lys Ser Ala
1265 1270 1275 1280

Arg Thr Arg Lys Glu Asn Lys Phe Ser Ala Lys Arg Leu Pro Ser Thr
1285 1290 1295

Arg Leu Gly Thr Phe Leu Glu Asn Arg Val Asn Asp Phe Leu Arg Arg
1300 1305 1310

Gln Asn His Pro Glu Ser Gly Glu Val Thr Val Arg Val Val His Ala
1315 1320 1325

Ser Asp Lys Thr Val Glu Val Lys Pro Gly Met Lys Ala Arg Phe Val
1330 1335 1340

Asp Ser Gly Glu Met Ala Glu Ser Phe Pro Tyr Arg Thr Lys Ala Leu
1345 1350 1355 1360

Phe Ala Phe Glu Glu Ile Asp Gly Val Asp Leu Cys Phe Phe Gly Met
1365 1370 1375

His Val Gln Glu Tyr Gly Ser Asp Cys Pro Pro Pro Asn Gln Arg Arg
1380 1385 1390

Val Tyr Ile Ser Tyr Leu Asp Ser Val His Phe Phe Arg Pro Lys Cys
1395 1400 1405

Leu Arg Thr Ala Val Tyr His Glu Ile Leu Ile Gly Tyr Leu Glu Tyr
1410 1415 1420

Val Lys Lys Leu Gly Tyr Thr Thr Gly His Ile Trp Ala Cys Pro Pro
1425 1430 1435 1440

Ser Glu Gly Asp Asp Tyr Ile Phe His Cys His Pro Pro Asp Gln Lys
1445 1450 1455

Ile Pro Lys Pro Lys Arg Leu Gln Glu Trp Tyr Lys Lys Met Leu Asp
1460 1465 1470

Lys Ala Val Ser Glu Arg Ile Val His Asp Tyr Lys Asp Ile Phe Lys
1475 1480 1485

Gln Ala Thr Glu Asp Arg Leu Thr Ser Ala Lys Glu Leu Pro Tyr Phe
1490 1495 1500

Glu Gly Asp Phe Trp Pro Asn Val Leu Glu Glu Ser Ile Lys Glu Leu
1505 1510 1515 1520

Glu Gln Glu Glu Glu Glu Arg Lys Arg Glu Glu Asn Thr Ser Asn Glu
1525 1530 1535

Ser Thr Asp Val Thr Lys Gly Asp Ser Lys Asn Ala Lys Lys Lys Asn
1540 1545 1550

Asn Lys Lys Thr Ser Lys Asn Lys Ser Ser Leu Ser Arg Gly Asn Lys
1555 1560 1565

Lys Lys Pro Gly Met Pro Asn Val Ser Asn Asp Leu Ser Gln Lys Leu
1570 1575 1580

Tyr Ala Thr Met Glu Lys His Lys Glu Val Phe Phe Val Ile Arg Leu
1585 1590 1595 1600

Ile Ala Gly Pro Ala Ala Asn Ser Leu Pro Pro Ile Val Asp Pro Asp
1605 1610 1615

Pro Leu Ile Pro Cys Asp Leu Met Asp Gly Arg Asp Ala Phe Leu Thr
1620 1625 1630

Leu Ala Arg Asp Lys His Leu Glu Phe Ser Ser Leu Arg Arg Ala Gln
1635 1640 1645

Trp Ser Thr Met Cys Met Leu Val Glu Leu His Thr Gln Ser Gln Asp
1650 1655 1660

Arg Phe Val Tyr Thr Cys Asn Glu Cys Lys His His Val Glu Thr Arg
1665 1670 1675 1680

Trp His Cys Thr Val Cys Glu Asp Tyr Asp Leu Cys Ile Thr Cys Tyr
1685 1690 1695

Asn Thr Lys Asn His Asp His Lys Met Glu Lys Leu Gly Leu Gly Leu
1700 1705 1710

Asp Asp Glu Ser Asn Asn Gln Gln Ala Ala Ala Thr Gln Ser Pro Gly
1715 1720 1725

Asp Ser Arg Arg Leu Ser Ile Gln Arg Cys Ile Gln Ser Leu Val His
1730 1735 1740

Ala Cys Gln Cys Arg Asn Ala Asn Cys Ser Leu Pro Ser Cys Gln Lys
1745 1750 1755 1760

Met Lys Arg Val Val Gln His Thr Lys Gly Cys Lys Arg Lys Thr Asn
1765 1770 1775

Gly Gly Cys Pro Ile Cys Lys Gln Leu Ile Ala Leu Cys Cys Tyr His
1780 1785 1790

Ala Lys His Cys Gln Glu Asn Lys Cys Pro Val Pro Phe Cys Leu Asn
1795 1800 1805

Ile Lys Gln Lys Leu Arg Gln Gln Gln Leu Gln His Arg Leu Gln Gln
1810 1815 1820

Ala Gln Met Leu Arg Arg Arg Met Ala Ser Met Gln Arg Thr Gly Val
1825 1830 1835 1840

Val Gly Gln Gln Gln Gly Leu Pro Ser Pro Thr Pro Ala Thr Pro Thr
1845 1850 1855

Thr Pro Thr Gly Gln Gln Pro Thr Thr Pro Gln Thr Pro Gln Pro Thr
1860 1865 1870

Ser Gln Pro Gln Pro Thr Pro Pro Asn Ser Met Pro Pro Tyr Leu Pro
1875 1880 1885

Arg Thr Gln Ala Ala Gly Pro Val Ser Gln Gly Lys Ala Ala Gly Gln
1890 1895 1900

Val Thr Pro Pro Thr Pro Pro Gln Thr Ala Gln Pro Pro Leu Pro Gly
1905 1910 1915 1920

Pro Pro Pro Thr Ala Val Glu Met Ala Met Gln Ile Gln Arg Ala Ala
1925 1930 1935

Glu Thr Gln Arg Gln Met Ala His Val Gln Ile Phe Gln Arg Pro Ile
1940 1945 1950

Gln His Gln Met Pro Pro Met Thr Pro Met Ala Pro Met Gly Met Asn
1955 1960 1965

Pro Pro Pro Met Thr Arg Gly Pro Ser Gly His Leu Glu Pro Gly Met
1970 1975 1980

Gly Pro Thr Gly Met Gln Gln Gln Pro Pro Trp Ser Gln Gly Gly Leu
1985 1990 1995 2000

Pro Gln Pro Gln Gln Leu Gln Ser Gly Met Pro Arg Pro Ala Met Met
2005 2010 2015

Ser Val Ala Gln His Gly Gln Pro Leu Asn Met Ala Pro Gln Pro Gly
2020 2025 2030

Leu Gly Gln Val Gly Ile Ser Pro Leu Lys Pro Gly Thr Val Ser Gln
2035 2040 2045

Gln Ala Leu Gln Asn Leu Leu Arg Thr Leu Arg Ser Pro Ser Ser Pro
2050 2055 2060

Leu Gln Gln Gln Gln Val Leu Ser Ile Leu His Ala Asn Pro Gln Leu
2065 2070 2075 2080

Leu Ala Ala Phe Ile Lys Gln Arg Ala Ala Lys Tyr Ala Asn Ser Asn
2085 2090 2095

Pro Gln Pro Ile Pro Gly Gln Pro Gly Met Pro Gln Gly Gln Pro Gly
2100 2105 2110

Leu Gln Pro Pro Thr Met Pro Gly Gln Gln Gly Val His Ser Asn Pro
2115 2120 2125

Ala Met Gln Asn Met Asn Pro Met Gln Ala Gly Val Gln Arg Ala Gly
2130 2135 2140

Leu Pro Gln Gln Gln Pro Gln Gln Gln Leu Gln Pro Pro Met Gly Gly
2145 2150 2155 2160

Met Ser Pro Gln Ala Gln Gln Met Asn Met Asn His Asn Thr Met Pro
2165 2170 2175

Ser Gln Phe Arg Asp Ile Leu Arg Arg Gln Gln Met Met Gln Gln Gln
2180 2185 2190

Gln Gln Gln Gly Ala Gly Pro Gly Ile Gly Pro Gly Met Ala Asn His
2195 2200 2205

Asn Gln Phe Gln Gln Pro Gln Gly Val Gly Tyr Pro Pro Gln Pro Gln
2210 2215 2220

Gln Arg Met Gln His His Met Gln Gln Met Gln Gln Gly Asn Met Gly
2225 2230 2235 2240

Gln Ile Gly Gln Leu Pro Gln Ala Leu Gly Ala Glu Ala Gly Ala Ser
2245 2250 2255

Leu Gln Ala Tyr Gln Gln Arg Leu Leu Gln Gln Gln Met Gly Ser Pro
2260 2265 2270

Val Gln Pro Asn Pro Met Ser Pro Gln Gln His Met Leu Pro Asn Gln
2275 2280 2285

Ala Gln Ser Pro His Leu Gln Gly Gln Gln Ile Pro Asn Ser Leu Ser
2290 2295 2300

Asn Gln Val Arg Ser Pro Gln Pro Val Pro Ser Pro Arg Pro Gln Ser
2305 2310 2315 2320

Gln Pro Pro His Ser Ser Pro Ser Pro Arg Met Gln Pro Gln Pro Ser
2325 2330 2335

Pro His His Val Ser Pro Gln Thr Ser Ser Pro His Pro Gly Leu Val
2340 2345 2350

Ala Ala Gln Ala Asn Pro Met Glu Gln Gly His Phe Ala Ser Pro Asp
2355 2360 2365

Gln Asn Ser Met Leu Ser Gln Leu Ala Ser Asn Pro Gly Met Ala Asn
2370 2375 2380

Leu His Gly Ala Ser Ala Thr Asp Leu Gly Leu Ser Thr Asp Asn Ser
2385 2390 2395 2400

Asp Leu Asn Ser Asn Leu Ser Gln Ser Thr Leu Asp Ile His
2405 2410

(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2441 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO : 7 : 

Met Ala Glu Asn Leu Leu Asp Gly Pro Pro Asn Pro Lys Arg Ala Lys
1 5 10 15

Leu Ser Ser Pro Gly Phe Ser Ala Asn Asp Asn Thr Asp Phe Gly Ser
20 25 30

Leu Phe Asp Leu Glu Asn Asp Leu Pro Asp Glu Leu Ile Pro Asn Gly
35 40 45

Glu Leu Ser Leu Leu Asn Ser Gly Asn Leu Val Pro Asp Ala Ala Ser
50 55 60

Lys His Lys Gln Leu Ser Glu Leu Leu Arg Gly Gly Ser Gly Ser Ser
65 70 75 80

Ile Asn Pro Gly Ile Gly Asn Val Ser Ala Ser Ser Pro Val Gln Gln
85 90 95

Gly Leu Gly Gly Gln Ala Gln Gly Gln Pro Asn Ser Thr Asn Met Ala
100 105 110

Ser Leu Gly Ala Met Gly Lys Ser Pro Leu Asn Gln Gly Asp Ser Ser
115 120 125

Thr Pro Asn Leu Pro Lys Gln Ala Ala Ser Thr Ser Gly Pro Thr Pro
130 135 140

Pro Ala Ser Gln Ala Leu Asn Pro Gln Ala Gln Lys Gln Val Gly Leu
145 150 155 160

Val Thr Ser Ser Pro Ala Thr Ser Gln Thr Gly Pro Gly Ile Cys Met
165 170 175

Asn Ala Asn Phe Asn Gln Thr His Pro Gly Leu Leu Asn Ser Asn Ser
180 185 190

Gly His Ser Leu Met Asn Gln Ala Gln Gln Gly Gln Ala Gln Val Met
195 200 205

Asn Gly Ser Leu Gly Ala Ala Gly Arg Gly Arg Gly Ala Gly Met Pro
210 215 220

Tyr Pro Ala Pro Ala Met Gln Gly Ala Thr Ser Ser Val Leu Ala Glu
225 230 235 240

Thr Leu Thr Gln Val Ser Pro Gln Met Ala Gly His Ala Gly Leu Asn
245 250 255

Thr Ala Gln Ala Gly Gly Met Thr Lys Met Gly Met Thr Gly Thr Thr
260 265 270

Ser Pro Phe Gly Gln Pro Phe Ser Gln Thr Gly Gly Gln Gln Met Gly
275 280 285

Ala Thr Gly Val Asn Pro Gln Leu Ala Ser Lys Gln Ser Met Val Asn
290 295 300

Ser Leu Pro Ala Phe Pro Thr Asp Ile Lys Asn Thr Ser Val Thr Thr
305 310 315 320

Val Pro Asn Met Ser Gln Leu Gln Thr Ser Val Gly Ile Val Pro Thr
325 330 335

Gln Ala Ile Ala Thr Gly Pro Thr Ala Asp Pro Glu Lys Arg Lys Leu
340 345 350

Ile Gln Gln Gln Leu Val Leu Leu Leu His Ala His Lys Cys Gln Arg
355 360 365

Arg Glu Gln Ala Asn Gly Glu Val Arg Ala Cys Ser Leu Pro His Cys
370 375 380

Arg Thr Met Lys Asn Val Leu Asn His Met Thr His Cys Gln Ala Pro
385 390 395 400

Lys Ala Cys Gln Val Ala His Cys Ala Ser Ser Arg Gln Ile Ile Ser
405 410 415

His Trp Lys Asn Cys Thr Arg His Asp Cys Pro Val Cys Leu Pro Leu
420 425 430

Lys Asn Ala Ser Asp Lys Arg Asn Gln Gln Thr Ile Leu Gly Ser Pro
435 440 445

Ala Ser Gly Ile Gln Asn Thr Ile Gly Ser Val Gly Ala Gly Gln Gln
450 455 460

Asn Ala Thr Ser Leu Ser Asn Pro Asn Pro Ile Asp Pro Ser Ser Met
465 470 475 480

Gln Arg Ala Tyr Ala Ala Leu Gly Leu Pro Tyr Met Asn Gln Pro Gln
485 490 495

Thr Gln Leu Gln Pro Gln Val Pro Gly Gln Gln Pro Ala Gln Pro Pro
500 505 510

Ala His Gln Gln Met Arg Thr Leu Asn Ala Leu Gly Asn Asn Pro Met
515 520 525

Ser Val Pro Ala Gly Gly Ile Thr Thr Asp Gln Gln Pro Pro Asn Leu
530 535 540

Ile Ser Glu Ser Ala Leu Pro Thr Ser Leu Gly Ala Thr Asn Pro Leu
545 550 555 560

Met Asn Asp Gly Ser Asn Ser Gly Asn Ile Gly Ser Leu Ser Thr Ile
565 570 575

Pro Thr Ala Ala
580 

Glu His Val Thr 
595 

Gin Ala lie Phe 
610 

Met Glu Asn Leu 
625 

Glu Ser Ala Asn 

lie Tyr Lys lie 
660 

His Lys Gin Gly 
675 

Ala Gin Pro Pro 
690 

Gly Pro Leu Pro 
705 

Asn Ser Phe Asn 

Pro Met Gly Pro 
740 

Asn Ser Met Ala 
755 

Pro Gin Pro Pro 
770 

Gin Ala Pro Thr 
785 

Ser Ser Gly Ala 

Ala Gin Ala Gly 
820 

Asn Pro Leu Asn 
835 

Pro Val Thr Gin 
850 

Ala Ala Gly Met 
865 

Pro Pro Gin Pro 

Gly Gin Thr Pro 
900 

Thr Gin Ser Thr 
915 

Pro Gin Pro Gin 
930 

Ser Ser Gin Gin 
945 

Pro Leu Ser Gin 

Ser Thr Val Thr 
980 

Val Pro Met Leu 
995 

Pro Glu Pro Thr 

1010 
Glu Asp Leu Gin 
025 

Glu Gin Lys Ser 

Lys Val Glu Ala 
1060 



Pro Pro Ser Ser 

Gin Asp Leu Arg 
600 

Pro Thr Pro Asp 

615 

Val Ala Tyr Ala 
630 

Ser Arg Asp Glu 
645 

Gin Lys Glu Leu 

lie Leu Gly Asn 
680 

Val lie Pro Pro 

695 

Leu Pro Val Asn 
710 

Pro Met Ser Leu 
725 

Arg Ala Ala Ser 

Ser Val Pro Gly 
760 

Asn Met Met Gly 
775 

Gin Asn Gin Phe 
790 

Met Ser Val Asn 
805 

Val Ser Gin Gly 

Met Leu Ala Pro 
840 

Ser Pro Leu His 
855 

Pro Ser Leu Gin 
870 

Ala Ala Pro Thr 
885 

Thr Pro Thr Pro 

Pro Thr Val Gin 
920 

Thr Pro Val Gin 

935 

Gin Pro Thr Pro 

950 

Ala Ala Ala Ser 
965 

Ser Ala Glu Thr 

Glu Met Lys Thr 
1000 

Glu Ser Lys Gly 

1015 

Gly Ser Ser Gin 
1030 

Glu Pro Met Glu 
045 

Lys Glu Glu Glu 




Thr Gly Val Arg 
585 

Ser His Leu Val 

Pro Ala Ala Leu 
620 

Lys Lys Val Glu 
635 

Tyr Tyr His Leu 

650 

Glu Glu Lys Arg 
665 

Gin Pro Ala Leu 

Ala Gin Ser Val 
700 

Arg Met Gin Val 
715 

Gly Asn Val Gin 
730 

Pro Met Asn His 
745 

Met Ala lie Ser 

Thr His Ala Asn 

780 

Leu Pro Gin Asn 
795 

Ser Val Gly Met 
810 

Gin Glu Pro Gly 
825 

Gin Ala Ser Gin 

Pro Thr Pro Pro 
860 

His Pro Thr Ala 
875 

Gin Pro Ser Thr 
890 

Gly Ser Val Pro 
905 

Ala Ala Ala Gin 

Pro Pro Ser Val 
940 

Val His Thr Gin 
955 

lie Asp Asn Arg 

970 

Ser Ser Gin Gin 
985 

Glu Val Gin Thr 

Glu Pro Arg Ser 
1020 

Val Lys Glu Glu 

1035 

Val Glu Glu Lys 
1050 

Glu Asn Ser Ser 
.065 



Lys 


Gly 


Trp 


His 




590 






His 


Lys 


Leu 


Val 


605 








Lvs 


Asp 


Ara 


Ara 


Gly 


Asp 


Met 


Tyr 








640 


Leu 


Ala 


Glu 


Lys 






655 




Ara 


Thr 


Arg 


Leu 




670 






Pro 


Ala 


Ser 


Gl v 


685 








Ara 


Pro 


Pro 


As n 


Ser 


Gin 


Gly 


Met 








720 


Leu 


O y r\ 
nu 










7 35 




Ser 


Val 


Gin 


Met 




750 






Pro 


Ser 


Arg 


Met 


765 








Asn 


lie 


Met 


Ala 


Gin 


Phe 


Pro 


Ser 








800 


Gly 


Gin 


Pro 


Ala 






815 




Ala 


Ala 


Leu 


Pro 




830 






Leu 


Pro 


Cy s 


Pro 


845 








Pro 


Ala 


Ser 


Thr 


Pro 


Gly 


Met 


Thr 








880 


Pro 


Val 


Ser 


Ser 






O _? *J 




Q » r~ 
O " i. 






ri 1 n 

ul J 1 




910 






Ala 


Gin 


Val 


Thr 


925 








Ala 


Thr 


Pro 


Gin 


Pro 


Pro 


Gly 


Thr 








960 


Val 


Pro 


Thr 


Pro 






975 




Pro 


Gl v 
y 


Pro 


Asp 




990 






Asp 


Asp 


Ala 


Glu 


.005 








Glu 


Met 


Met 


Glu 


Thr 


Asp 


Thr 


Thr 






1040 


Lys 


Pro 


Glu 


Val 




1055 




Asn 


Asp 


Thr 


Ala

Ser Gln Ser Thr Ser Pro Ser Gln Pro Arg Lys Lys Ile Phe Lys Pro
1075 1080 1085

Glu Glu Leu Arg Gln Ala Leu Met Pro Thr Leu Glu Ala Leu Tyr Arg
1090 1095 1100

Gln Asp Pro Glu Ser Leu Pro Phe Arg Gln Pro Val Asp Pro Gln Leu
1105 1110 1115 1120

Leu Gly Ile Pro Asp Tyr Phe Asp Ile Val Lys Asn Pro Met Asp Leu
1125 1130 1135

Ser Thr Ile Lys Arg Lys Leu Asp Thr Gly Gln Tyr Gln Glu Pro Trp
1140 1145 1150

Gln Tyr Val Asp Asp Val Arg Leu Met Phe Asn Asn Ala Trp Leu Tyr
1155 1160 1165

Asn Arg Lys Thr Ser Arg Val Tyr Lys Phe Cys Ser Lys Leu Ala Glu
1170 1175 1180

Val Phe Glu Gln Glu Ile Asp Pro Val Met Gln Ser Leu Gly Tyr Cys
1185 1190 1195 1200

Cys Gly Arg Lys Tyr Glu Phe Ser Pro Gln Thr Leu Cys Cys Tyr Gly
1205 1210 1215

Lys Gln Leu Cys Thr Ile Pro Arg Asp Ala Ala Tyr Tyr Ser Tyr Gln
1220 1225 1230

Asn Arg Tyr His Phe Cys Gly Lys Cys Phe Thr Glu Ile Gln Gly Glu
1235 1240 1245

Asn Val Thr Leu Gly Asp Asp Pro Ser Gln Pro Gln Thr Thr Ile Ser
1250 1255 1260

Lys Asp Gln Phe Glu Lys Lys Lys Asn Asp Thr Leu Asp Pro Glu Pro
1265 1270 1275 1280

Phe Val Asp Cys Lys Glu Cys Gly Arg Lys Met His Gln Ile Cys Val
1285 1290 1295

Leu His Tyr Asp Ile Ile Trp Pro Ser Gly Phe Val Cys Asp Asn Cys
1300 1305 1310

Leu Lys Lys Thr Gly Arg Pro Arg Lys Glu Asn Lys Phe Ser Ala Lys
1315 1320 1325

Arg Leu Gln Thr Thr Arg Leu Gly Asn His Leu Glu Asp Arg Val Asn
1330 1335 1340

Lys Phe Leu Arg Arg Gln Asn His Pro Glu Ala Gly Glu Val Phe Val
1345 1350 1355 1360

Arg Val Val Ala Ser Ser Asp Lys Thr Val Glu Val Lys Pro Gly Met
1365 1370 1375

Lys Ser Arg Phe Val Asp Ser Gly Glu Met Ser Glu Ser Phe Pro Tyr
1380 1385 1390

Arg Thr Lys Ala Leu Phe Ala Phe Glu Glu Ile Asp Gly Val Asp Val
1395 1400 1405

Cys Phe Phe Gly Met His Val Gln Asp Thr Ala Leu Ile Ala Pro His
1410 1415 1420

Gln Ile Gln Gly Cys Val Tyr Ile Ser Tyr Leu Asp Ser Ile His Phe
1425 1430 1435 1440

Phe Arg Pro Arg Cys Leu Arg Thr Ala Val Tyr His Glu Ile Leu Ile
1445 1450 1455

Gly Tyr Leu Glu Tyr Val Lys Lys Leu Val Tyr Val Thr Ala His Ile
1460 1465 1470

Trp Ala Cys Pro Pro Ser Glu Gly Asp Asp Tyr Ile Phe His Cys His
1475 1480 1485

Pro Pro Asp Gln Lys Ile Pro Lys Pro Lys Arg Leu Gln Glu Trp Tyr
1490 1495 1500

Lys Lys Met Leu Asp Lys Ala Phe Ala Glu Arg Ile Ile Asn Asp Tyr
1505 1510 1515 1520

Lys Asp Ile Phe Lys Gln Ala Asn Glu Asp Arg Leu Thr Ser Ala Lys
1525 1530 1535

Glu Leu Pro Tyr Phe Glu Gly Asp Phe Trp Pro Asn Val Leu Glu Glu
1540 1545 1550

Ser Ile Lys Glu Leu Glu Gln Glu Glu Glu Glu Arg Lys Lys Glu Glu
1555 1560 1565 
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Ser Thr Ala Ala Ser Glu Thr Pro Glu Gly Ser Gin Gly Asp Ser Lys 

1570 1575 1580 

Asn Ala Lys Lys Lys Asn Asn Lys Lys Thr Asn Lys Asn Lys Ser Ser 
585 1590 1595 1600 

lie Ser Arg Ala Asn Lys Lys Lys Pro Ser Met Pro Asn Val Ser Asn 

1605 1610 1615 

Asp Leu Ser Gin Lys Leu Tyr Ala Thr Met Glu Lys His Lys Glu Val 

1620 1625 1630 

Phe Phe Val lie His Leu His Ala Gly Pro Val lie Ser Thr Gin Pro 

1635 1640 1645 

Pro lie Val Asp Pro Asp Pro Leu Leu Ser Cys Asp Leu Met Asp Gly 

1650 1655 1660 

Arg Asp Ala Phe Leu Thr Leu Ala Arg Asp Lys His Trp Glu Phe Ser 
665 * 1670 1675 1680 

Ser Leu Arg Arg Ser Lys Trp Ser Thr Leu Cys Met Leu Val Glu Leu 

1685 1690 1695 

His Thr Gin Gly Gin Asp Arg Phe Val Tyr Thr Cys Asn Glu Cys Lys 

1700 1705 1710 

His His Val Glu Thr Arg Trp His Cys Thr Val Cys Glu Asp Tyr Asp 

1715 1720 1725 

Leu Cys lie Asn Cys Tyr Asn Thr Lys Ser His Thr His Lys Met Val 

1730 1735 1740 

Lys Trp Gly Leu Gly Leu Asp Asp Glu Gly Ser Ser Gin Gly Glu Pro 
745 1750 1755 1760 

Gin Ser Lys Ser Pro Gin Glu Ser Arg Arg Leu Ser lie Gin Arg Cys 

1765 1770 1775 

lie Gin Ser Leu Val His Ala Cys Gin Cys Arg Asn Ala Asn Cys Ser 

1780 1785 1790 

Leu Pro Ser Cys Gin Lys Met Lys Arg Val Val Gin His Thr Lys Gly 

1795 1800 . 1805 

Cys Lys Arg Lys Thr Asn Gly Gly Cys Pro Val Cys Lys Gin Leu lie 

1810 1815 1820 

Ala Leu Cys Cys Tyr His Ala Lys His Cys Gin Glu Asn Lys Cys Pro 
825 1830 1835 1840 

Val Pro Phe Cys Leu Asn lie Lys His Asn Val Arg Gin Gin Gin lie 

1845 1850 1855 

Gin His Cys Leu Gin Gin Ala Gin Leu Met Arg Arg Arg Met Ala Thr 

1860 1865 1870 

Met Asn Thr Arg Asn Val Pro Gin Gin Ser Leu Pro Ser Pro Thr Ser 

1875 1880 1885 

Ala Pro Pro Gly Thr Pro Thr Gin Gin Pro Ser Thr Pro Gin Thr Pro 

1890 1895 1900 

Gin Pro Pro Ala Gin Pro Gin Pro Ser Pro Val Asn Met Ser Pro Ala 
905 1910 1915 1920 

Gly Phe Pro Asn Val Ala Arg Thr Gin Pro Pro Thr lie Val Ser Ala 

1925 1930 1935 

Gly Lys Pro Thr Asn Gin Val Pro Ala Pro Pro Pro Pro Ala Gin Pro 

1940 1945 1950 

Pro Pro Ala Ala Val Glu Ala Ala Arg Gin lie Glu Arg Glu Ala Gin 

1955 1960 1965 

Gin Gin Gin His Leu Tyr Arg Ala Asn lie Asn Asn Gly Met Pro Pro 

1970 1975 1980 

Gly Arg Asp Gly Met Gly Thr Pro Gly Ser Gin Met Thr Pro Val Gly 
985 1990 1995 2000 

Leu Asn Val Pro Arg Pro Asn Gin Val Ser Gly Pro Val Met Ser Ser 

2005 2010 2015 

Met Pro Pro Gly Gin Trp Gin Gin Ala Pro lie Pro Gin Gin Gin Pro 

2020 2025 2030 

Met Pro Gly Met Pro Arg Pro Val Met Ser Met Gin Ala Gin Ala Ala 

2035 2040 2045 

Val Ala Gly Pro Arg Met Pro Asn Val Gin Pro Asn Arg Ser lie Ser 
2050 2055 2060 




Pro Ser Ala Leu Gln Asp Leu Leu Arg Thr Leu Lys Ser Pro Ser Ser

2065 2070 2075 2080

Pro Gln Gln Gln Gln Gln Val Leu Asn Ile Leu Lys Ser Asn Pro Gln

2085 2090 2095

Leu Met Ala Ala Phe Ile Lys Gln Arg Thr Ala Lys Tyr Val Ala Asn

2100 2105 2110

Gln Pro Gly Met Gln Pro Gln Pro Gly Leu Gln Ser Gln Pro Gly Met

2115 2120 2125

Gln Pro Gln Pro Gly Met His Gln Gln Pro Ser Leu Gln Asn Leu Asn

2130 2135 2140

Ala Met Gln Ala Gly Val Pro Arg Pro Gly Val Pro Pro Pro Gln Pro

2145 2150 2155 2160

Ala Met Gly Gly Leu Asn Pro Gln Gly Gln Ala Leu Asn Ile Met Asn

2165 2170 2175

Pro Gly His Asn Pro Asn Met Thr Asn Met Asn Pro Gln Tyr Arg Glu

2180 2185 2190

Met Val Arg Arg Gln Leu Leu Gln His Gln Gln Gln Gln Gln Gln Gln

2195 2200 2205

Gln Gln Gln Gln Gln Gln Gln Gln Asn Ser Ala Ser Leu Ala Gly Gly

2210 2215 2220

Met Ala Gly His Ser Gln Phe Gln Gln Pro Gln Gly Pro Gly Gly Tyr

2225 2230 2235 2240

Ala Pro Ala Met Gln Gln Gln Arg Met Gln Gln His Leu Pro Ile Gln

2245 2250 2255

Gly Ser Ser Met Gly Gln Met Ala Ala Pro Met Gly Gln Leu Gly Gln

2260 2265 2270

Met Gly Gln Pro Gly Leu Gly Ala Asp Ser Thr Pro Asn Ile Gln Gln

2275 2280 2285

Ala Leu Gln Gln Arg Ile Leu Gln Gln Gln Gln Met Lys Gln Gln Ile

2290 2295 2300

Gly Ser Pro Gly Gln Pro Asn Pro Met Ser Pro Gln Gln His Met Leu

2305 2310 2315 2320

Ser Gly Gln Pro Gln Ala Ser His Leu Pro Gly Gln Gln Ile Ala Thr

2325 2330 2335

Ser Leu Ser Asn Gln Val Arg Ser Pro Ala Pro Val Gln Ser Pro Arg

2340 2345 2350

Pro Gln Ser Gln Pro Pro His Ser Ser Pro Ser Pro Arg Ile Gln Pro

2355 2360 2365

Gln Pro Ser Pro His His Val Ser Pro Gln Thr Gly Thr Pro His Pro

2370 2375 2380

Gly Leu Ala Val Thr Met Ala Ser Ser Met Asp Gln Gly His Leu Gly

2385 2390 2395 2400

Asn Pro Glu Gln Ser Ala Met Leu Pro Gln Leu Asn Thr Pro Asn Arg

2405 2410 2415 

Ser Ala Leu Ser Ser Glu Leu Ser Leu Val Gly Asp Thr Thr Gly Asp 

2420 2425 2430 

Thr Leu Glu Lys Phe Val Glu Gly Leu 
2435 2440 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 813 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Ala Glu Ala Gly Gly Ala Gly Ser Pro Ala Leu Pro Pro Ala Pro 
1 5 10 15 



Pro His Gly Ser Pro Arg Thr Leu Ala Thr Ala Ala Gly Ser Ser Ala

20 25 30

Ser Cys Gly Pro Ala Thr Pro Val Ala Ala Ala Gly Thr Ala Glu Gly

35 40 45

Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala Gln Leu

50 55 60

Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val Tyr Ser

65 70 75 80

Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys Asn Pro

85 90 95

Asn Pro Ser Pro Thr Pro Pro Arg Gly Asp Leu Gln Gln Ile Ile Val

100 105 110

Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala Ala His

115 120 125

Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asp Arg Leu Leu

130 135 140

Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His Lys Glu

145 150 155 160

Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys Leu Leu

165 170 175

Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly Ser Leu

180 185 190

Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly Val Asn

195 200 205

Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ser Lys Glu Arg Gln

210 215 220

Thr Thr Ile Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn Tyr Trp

225 230 235 240

His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn Asp Asp

245 250 255

Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr Cys Asn

260 265 270

Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr Lys Val

275 280 285

Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Ile Met Arg Arg Gln

290 295 300

Leu Leu Glu Gln Ala Arg Gln Lys Lys Asp Lys Leu Pro Leu Glu Lys

305 310 315 320

Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser Met Leu Glu

325 330 335

Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln Asp Phe Leu

340 345 350

Ser Ala Ser Ser Arg Thr Ser Pro Leu Gly Ile Gln Thr Val Ile Ser

355 360 365

Pro Pro Val Thr Gly Thr Ala Leu Phe Ser Ser Asn Ser Thr Ser His

370 375 380

Glu Gln Ile Asn Gly Gly Arg Thr Ser Pro Gly Cys Arg Gly Ser Ser

385 390 395 400

Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met Asn Asn Ser His

405 410 415

Ala Pro Glu Glu Ala Lys Arg Ser Arg Val Met Gly Asp Ile Pro Val

420 425 430

Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp Pro Ala Gly Met

435 440 445

Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser Ala Arg Asp Glu

450 455 460

Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu Phe His Val Val

465 470 475 480

Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile Leu Met Trp Leu

485 490 495

Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg Met Pro Lys

500 505 510

Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His Lys Thr Leu Ala

515 520 525

Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe Arg Met Phe

530 535 540

Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val Thr Ser Asn

545 550 555 560

Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His Leu Lys Glu

565 570 575

Tyr His Ile Lys His Glu Ile Leu Asn Phe Leu Thr Tyr Ala Asp Glu

580 585 590

Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys Glu Ile Lys

595 600 605

Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp Tyr Glu Gly Ala

610 615 620

Thr Leu Met Gly Cys Glu Leu Asn Pro Gln Ile Pro Tyr Thr Glu Phe

625 630 635 640

Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys Leu Ile Glu

645 650 655

Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu Ser Cys Phe

660 665 670

Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile Pro Gly Ile Arg

675 680 685

Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser Lys Glu Pro Lys

690 695 700

Asp Pro Glu His Val Tyr Ser Thr Leu Lys Asn Ile Leu Gln Gln Val

705 710 715 720

Lys Asn His Pro Asn Ala Trp Pro Phe Met Glu Pro Val Lys Arg Thr

725 730 735

Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Phe Pro Met Asp Leu Lys

740 745 750

Thr Met Ser Glu Arg Leu Arg Asn Arg Tyr Tyr Val Ser Lys Lys Leu

755 760 765

Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys Lys Glu Tyr Asn

770 775 780

Pro Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Ser Ile Leu Glu Lys Phe

785 790 795 800

Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys

805 810

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

His Thr Lys Gly Cys Lys Arg Lys Thr Asn Gly Gly Cys Pro Val Cys

1 5 10 15

Lys Gln Leu Ile Ala Leu Cys Cys Tyr His Ala Lys His Cys Gln Glu

20 25 30

Asn Lys Cys Pro Val Pro Phe Cys Leu Asn Ile Lys His Asn Val Arg

35 40 45

Gln Gln
50 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2204 base pairs 




(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

ACCCACTCCC CCCAGAGCCG ACCTGCAGCA AATAATTGTC AGTCTAACAG AATCCTGTCG 60 
GAGTTGTAGC CATGCCCTAG CTGCTCATGT TTCCCACCTG GAGAATGTGT CAGAGGAAGA 120

AATGAACAGA CTCCTGGGAA TAGTATTGGA TGTGGAATAT CTCTTTACCT GTGTCCACAA 180

GGAAGAAGAT GCAGATACCA AACAAGTTTA TTTCTATCTA TTTAAGCTCT TGAGAAAGTC 240

TATTTTACAA AGAGGAAAAC CTGTGGTTGG AAGGCTCTTT GGAAAAGAAA CCCCCATTTG 300

AAAAACCTAG CATTGAACAG GGTGTGAATA ACTTTGTGCA GTACAAATTT AGTCACCTGC 360

CAGCAAAAAG AAAGGCAAAC CAATAGTTGA GTTGGCAAAA ATGTTCCTAA ACCGCATCAC 420

CTATTGGCAT CTGGAGGCAC CATCTCAACG AGACTGCGAT CTCCAATGAT GATATTCTGG 480

ATACAAAGAG AACTACACAA GGTGGCTGTG TTACTGCAAC GTGCCACAGT TCTGCGACAG 540

TCTACCTCGG TACGAAACCA CACAGGTGTT TGGGAGAACA TCGTTCGCTC GGTCTTCACT 600

GTTATGAGGC GACAACTCCT GGAACAAGCA AGACAGGAAA AAGATAAACT GCCTCTTGAA 660

AAACGAACTC TAATCCTCAC TCATTTCCCA AAATTTCTGT CCATGCTAGA AGAAGAAGTA 720

TATAGTCAAA ACTCTCCCAT CTGGGATCAC CATTTTCTCT CAGCCTCTTC CAGAACCAGC 780

CAGCTAGGCA TCCAAACAGT TATCAATCAC CTCCTGTGGC TGGGACAATT TCATACAATT 840

CAACCTCATC TTCCCTTGAG CAGCCAAACG CAGGGAGCAG CAGTCCTGCC TGCAAAGCCT 900

CTTCTGGACT TGAGGCAAAC CCAGGAGAAA AGAGGAAAAT GACTGATTCT CATGTTCTGG 960

AGGAGGCCAA GAAACCCCGA GTTATGGGGG ATATTCCGAT GGAATTAATC AACGAGGTTA 1020

TGTCTACCAT CACGGACCCT GCAGCAATGC TTGGACCAGA GACCAATTTT CTGTCAGCAC 1080

ACTCGGCCAG GGATGAGGCG GCAAGGTTGG AAGAGCGCAG GGGTGTAATT GAATTTCACG 1140

TGGTTGGCAA TTCCCTCAAC CAGAAACCAA ACAAGAAGAT CCTGATGTGG CTGGTTGGCC 1200

TACAGAACGT TTTCTCCCAC CAGCTGCCCC GAATGCCAAA AGAATACATC ACACGGCTCG 1260

TCTTTGACCC GAAACACAAA ACCCTTGCTT TAATTAAAGA TGGCCGTGTT ATTGGTGGTA 1320

TCTGTTTCCG TATGTTCCCA TCTCAAGGAT TCACAGAGAT TGTCTTCTGT GCTGTAACCT 1380

CAAATGAGCA AGTCAAGGGC TATGGAACAC ACCTGATGAA TCATTTGAAA GAATATCACA 1440

TAAAGCATGA CATCCTGAAC TTCCTCACAT ATGCAGATGA ATATGCAATT GGATACTTTA 1500

AGAAACAGGG TTTCTCCAAA GAAATTAAAA TACCTAAAAC CAAATATGTT GGCTATATCA 1560

AGGATTATGA AGGAGCCACT TTAATGGGAT GTGAGCTAAA TCCACGGATC CCGTACACAG 1620

AATTTTCTGT CATCATTAAA AAGCAGAAGG AGATAATTAA AAAACTGATT GAAAGAAAAC 1680

AGGCACAAAT TCGAAAAGTT TACCCTGGAC TTTCATGTTT TAAAGATGGA GTTCGACAGA 1740

TTCCTATAGA AAGCATTCCT GGAATTAGAG AGACAGGCTG GAAACCGAGT GGAAAAGAGA 1800

AAAGTAAAGA GCCCAGAGAC CCTGACCAGC TTTACAGCAC GCTCAAGAGC ATCCTCCAGC 1860

AGGTGAAGAG CCATCAAAGC GCTTGGCCCT TCATGGAACC TGTGAAGAGA ACAGAAGCTC 1920

CAGGATATTA TGAAGTTATA AGGTCCCCCA TGGATCTCAA AACCATGAGT GAACGCCTCA 1980

AGAATAGGTA CTACGTGTCT AAGAAATTAT TCATGGCAGA CTTACAGCGA GTCTTTACCA 2040

ATTGCAAAGA GTACAACGCC CCTGAGAGTG AATACTACAA ATGTGCCAAT ATCCTGGAGA 2100

AATTCTTCTT CAGTAAAATT AAGGAAGCTG GATTAATTGA CAAGTGATTT TTTTTCCCCC 2160

TCTGCTTCTT AGAAACTCAC CAAGCAGTGT GCCTAAAGCA AGGT 2204

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2093 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GAATTCCGGC GAAACCACTC ATGTCTTTGG GCGAAGCCTT CTCCGGTCCA TTTTCACCGT 60 

TACCCGCCGG CAGCTGCTGG AAAAGTTCCG AGTGGAGAAG GACAAATTGG TGCCCGAGAA 120

GAGGACCCTC ATCCTCACTC ACTTCCCCAA GTAAGGCTCC TTCTGGCCTA CCAGGATTTG 180

GCCCCAAGTT CACATCCTCC CTGTTGTCCC CTTTTTTCCA GGAAGGCTTC CTGGATTGGT 240

CCCTCCTCTC CCTCCATGGG CCTTTTGGGA TCTGGGCGTC TACCTGGCAG ACTTGCCCAT 300

GGCCCAGAAG CAACTTGCTA GTACTAGTCT GGGGATGGCA GATTCCTGTC CATGCTGGAG 360

GAGGAGATCT ATGGGGCAAA CTCTCCAATC TGGGAGTCAG GCTTCACCAT GCCACCCTCA 420

GAGGGGACAC AGCTGGTTCC CCGGCCAGCT TCAGTCAGTG CAGCGGTTGT TCCCAGCACC 480

CCCATCTTGA GCCCCAGCAT GGGTGGGGGC AGCAACAGCT CCCTGAGTCT GGATTCTGCA 540

GGGGCCGAGC CTATGCCAGG CGAGAAGAGG ACGCTCCCAG AGAACCTGAC CCTGGAGGAT 600

GCCAAGCGGC TCCGTGTGAT GGGTGACATC CCCATGGAGC TGGTCAATGA GGTCATGCTG 660

ACCATCACTG ACCCTGCTGC CATGCTGGGG CCTGAGACGA GCCTGCTTTC GGCCAATGCG 720

GCCCGGGATG AGACAGCCCG CCTGGAGGAG CGCCGCGGCA TCATCGAGTT CCATGTCATC 780

GGCAACTCAC TGACGCCCAA GGCCAACCGG CGGGTGTTGC TGTGGCTCGT GGGGCTGCAG 840

AATGTCTTTT CCCACCAGCT GCCGCGCATG CCTAAGGAGT ATATCGCCCG CCTCGTCTTT 900

GACCCGAAGC ACAAGACTCT GGCCTTGATC AAGGATGGGC GGGTCATCGG TGGCATCTGC 960

TTCCGCATGT TTCCCACCCA GGGCTTCACG GAGATTGTCT TCTGTGCTGT CACCTCGAAT 1020

GAGCAGGTCA AGGGTTATGG GACCCACCTG ATGAACCACC TGAAGGAGTA TCACATCAAG 1080

CACAACATTC TCTACTTCCT CACCTACGCC GACGAGTACG CCATCGGCTA CTTCAAAAAG 1140

CAGGGTTTCT CCAAGGACAT CAAGGTGCCC AAGAGCCGCT ACCTGGGCTA CATCAAGGAC 1200

TACGAGGGAG CGACGCTGAT GGAGTGTGAG CTGAATCCCC GCATCCCCTA CACGGAGCTG 1260

TCCCACATCA TCAAGAAGCA GAAAGAGATC ATCAAGAAGC TGATTGAGCG CAAACAGGCC 1320

CAGATCCGCA AGGTCTACCC GGGGCTCAGC TGCTTCAAGG AGGGCGTGAG GCAGATCCCT 1380

GTGGAGAGCG TTCCTGGCAT TCGAGAGACA GGCTGGAAGC CATTGGGGAA GGAGAAGGGG 1440

AAGGAGCTGA AGGACCCCGA CCAGCTCTAC ACAACCCTCA AAAACCTGCT GGCCCAAATC 1500

AAGTCTCACC CCAGTGCCTG GCCCTTCATG GAGCCTGTGA AGAAGTCGGA GGCCCCTGAC 1560

TACTACGAGG TCATCCGCTT CCCCATTGAC CTGAAGACCA TGACTGAGCG GCTGCGAAGC 1620

CGCTACTACG TGACCCGGAA GCTCTTTGTG GCCGACCTGC AGCGGGTCAT CGCCAACTGT 1680

CGCGAGTACA ACCCCCCGGA CAGCGAGTAC TGCCGCTGTG CCAGCGCCCT GGAGAAGTTC 1740

TTCTACTTCA AGCTCAAGGA GGGAGGCCTC ATTGACAAGT AGGCCCATCT TTGGGCCGCA 1800

GCCCTGACCT GGAATGTCTC CACCTCGGAT TCTGATCTGA TCCTTAGGGG GTGCCCTGGC 1860

CCCACGGACC CGACTCAGCT TGAGACACTC CAGCCAAGGG TCCTCCGGAC CCGATCCTGC 1920

AGCTCTTTCT GGACCTTCAG GCACCCCCAA GCGTGCAGCT CTGTCCCAGC CTTCACTGTG 1980

TGTGAGAGGT CTCCTGGGTT GGGGCCCAGC CCCTCTAGAG TAGCTGGTGG CCAGGGATGA 2040

ACCTTGCCCA GCCGTGGTGG CCCCCAGGCC TGGTCCCCAA GAGCCCGGAA TTC 2093

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9046 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CCTTGTTTGT GTGCTAGGCT GGGGGGGAGA GAGGGCGAGA GAGAGCGGGC GAGAGTGGGC 60

AAGCAGGACG CCGGGCTGAG TGCTAACTGC GGGACGCAGA GAGTGCGGAG GGGAGTCGGG 120

TCGGAGAGAG GCGGCAGGGG CCAGAACAGT GGCAGGGGGC CCGGGGCGCA CGGGCTGAGG 180

CGACCCCCAG CCCCCTCCCG TCCGCACACA CCCCCACCGC GGTCCAGCAG CCGGGCCGGC 240

GTCGACGCTA GGGGGGACCA TTACATAACC CGCGCCCCGG CCGTCTTCTC CCGCCGCCGC 300

GGCGCCCGAA CTGAGCCCGG GGCGGGCGCT CCAGCACTGG CCGCCGGCGT GGGGCGTAGC 360

AGCGGCCGTA TTATTATTTC GCGGAAAGGA AGGCGAAGGA GGGGAGCGCC GGCGCGAGGA 420

GGGGCCGCCT GCGCCCGCCG CCGGAGCGGG GCCTCCTCGG TGGGCTCCGC GTCGGCGCGG 480

GCGTGCGGGC GGCGCTGCTC GGCCCGGCCC CCTCGGCCCT CTGGTCCGGC CAGCTCCGCT 540

CCCGGCGTCC TTGCCGCGCC TCCGCCGGCC GCCGCGCGAT GTGAGGCGGC GGCGCCAGCC 600

TGGCTCTCGG CTCGGGCGAG TTCTCTGCGG CCATTAGGGG CCGGTGCGGC GGCGGCGCGG 660

AGCGCGGCGG CAGGAGGAGG GTTCGGAGGG TGGGGGCGCA GGCCCGGGAG GGGGCACCGG 720

GAGGAGGTGA GTGTCTCTTG TCGCC .'"CTC CTCTCCCCCC TTTTCGCCCC CGCCTCCTTG 7 80 

T GGC GAT GAG AAGGAGGAGG ACAGC ' CGA GGAGGAAGAG GTTGAT GGC G GCGGC GGAGC 8 40 

TCCGAGAGAC CTCGGCTGGG CAGGG, *;CGG CCGTGGCGGG CCGGGGACTG CGCCTCTAGA 900 

GCCGCGAGTT CTCGGGAATT CGCCGCAGCG GACCGGCCTC GGCGAATTTG TGCTCTTGTG 960

CCCTCCTCCG GGCTTGGGCC AGGCCGGCCC CTCGCACTTG CCCTTACCTT TTCTATCGAG 1020

TCCGCATCCC TCTCCAGCCA CTGCGACCCG GCGAAGAGAA AAAGGAACTT CCCCCACCCC 1080

CTCGGGTGCC GTCGGAGCCC CCCAGCCCAC CCCTGGGTGC GGCGCGGGGA CCCCGGGCCG 1140

AAGAAGAGAT TTCCTGAGGA TTCTGGTTTT CCTCGCTTGT ATCTCCGAAA GAATTAAAAA 1200

TGGCCGAGAA TGTGGTGGAA CCGGGGCCGC CTTCAGCCAA GCGGCCTAAA CTCTCATCTC 1260

CGGCCCTCTC GGCGTCCGCC AGCGATGGCA CAGATTTTGG CTCTCTATTT GACTTGGAGC 1320

ACGACTTACC AGATGAATTA ATCAACTCTA CAGAATTGGG ACTAACCAAT GGTGGTGATA 1380

TTAATCAGCT TCAGACAAGT CTTGGCATGG TACAAGATGC AGCTTCTAAA CATAAACAGC 1440

TGTCAGAATT GCTGCGATCT GGTAGTTCCC CTAACCTCAA TATGGGAGTT GGTGGCCCAG 1500

GTCAAGTCAT GGCCAGCCAG GCCCAACAGA GCAGTCCTGG ATTAGGTTTG ATAAATAGCA 1560

TGGTCAAAAG CCCAATGACA CAGGCAGGCT TGACTTCTCC CAACATGGGG ATGGGCACTA 1620

GTGGACCAAA TCAGGGTCCT ACGCAGTCAA CAGGTATGAT GAACAGTCCA GTAAATCAGC 1680

CTGCCATGGG AATGAACACA GGGACGAATG CGGGCATGAA TCCTGGAATG TTGGCTGCAG 1740

GCAATGGACA AGGGATAATG CCTAATCAAG TCATGAACGG TTCAATTGGA GCAGGCCGAG 1800

GGCGACAGGA TATGCAGTAC CCAAACCCAG GCATGGGAAG TGCTGGCAAC TTACTGACTG 1860

AGCCTCTTCA GCAGGGCTCT CCCCAGATGG GAGGACAAAC AGGATTGAGA GGCCCCCAGC 1920

CTCTTAAGAT GGGAATGATG AACAACCCCA ATCCTTATGG TTCACCATAT ACTCAGAATC 1980

CTGGACAGCA GATTGGAGCC AGTGGCCTTG GTCTCCAGAT TCAGACAAAA ACTGTACTAT 2040

CAAATAACTT ATCTCCATTT GCTATGGACA AAAAGGCAGT TCCTGGTGGA GGAATGCCCA 2100

ACATGGGTCA ACAGCCAGCC CCGCAGGTCC AGCAGCCAGG TCTGGTGACT CCAGTTGCCC 2160

AAGGGATGGG TTCTGGAGCA CATACAGCTG ATCCAGAGAA GCGCAAGCTC ATCCAGCAGC 2220

AGCTTGTTCT CCTTTTGCAT GCTCACAAGT GCCAGCGCCG GGAACAGGCC AATGGGGAAG 2280

TGAGGCAGTG CAACCTTCCC CACTGTCGCA CAATGAAGAA TGTCCTAAAC CACATGACAC 2340

ACTGGCAGTC AGGCAAGTCT TGCCAAGTGG CACACTGTGC ATCTTCTCGA CAAATCATTT 2400

CACACTGGAA GAATTGTACA AGACATGATT GTCCTGTGTG TCTCCCCCTC AAAAATGCTG 2460

GTGATAAGAG AAATCAACAG CCAATTTTGA CTGGAGCACC CGTTGGACTT GGAAATCCTA 2520

GCTCTCTAGG GGTGGGTCAA CAGTCTGCCC CCAACCTAAG CACTGTTAGT CAGATTGATC 2580

CCAGCTCCAT AGAAAGAGCC TATGCAGCTC TTGGACTACC CTATCAAGTA AATCAGATGC 2640

CGACACAACC CCAGGTGCAA GCAAAGAACC AGCAGAATCA GCAGCCTGGG CAGTCTCCCC 2700

AAGGCATGCG GCCCATGAGC AACATGAGTG CTAGTCCTAT GGGAGTAAAT GGAGGTGTAG 2760

GAGTTCAAAC GCCGAGTCTT CTTTCTGACT CAATGTTGCA TTCAGCCATA AATTCTCAAA 2820

ACCCAATGAT GAGTGAAAAT GCCAGTGTGC CCTCCCTGGG TCCTATGCCA ACAGCAGCTC 2880

AACCATCCAC TACTGGAATT CGGAAACAGT GGCACGAAGA TATTACTCAG GATCTTCGAA 2940

ATCATCTTGT TCACAAACTC GTCCAAGCCA TATTTCCTAC GCCGGATCCT GCTGCTTTAA 3000

AAGACAGACG GATGGAAAAC CTAGTTGCAT ATGCTCGGAA AGTTGAAGGG GACATGTATG 3060

AATCTGCAAA CAATCGAGCG GAATACTACC ACCTTCTAGC TGAGAAAATC TATAAGATCC 3120

AGAAAGAACT AGAAGAAAAA CGAAGGACCA GACTACAGAA GCAGAACATG CTACCAAATG 3180

CTGCAGGCAT GGTTCCAGTT TCCATGAATC CAGGGCCTAA CATGGGACAG CCGCAACCAG 3240

GAATGACTTC TAATGGCCCT CTACCTGACC CAAGTATGAT CCGTGGCAGT GTGCCAAACC 3300

AGATGATGCC TCGAATAACT CCACAATCTG GTTTGAATCA ATTTGGCCAG ATGAGCATGG 3360

CCCAGCCCCC TATTGTACCC CGGCAAACCC CTCCTCTTCA GCACCATGGA CAGTTGGCTC 3420

AACCTGGAGC TCTCAACCCG CCTATGGGCT ATGGGCCTCG TATGCAACAG CCTTCCAACC 3480

AGGGCCAGTT CCTTCCTCAG ACTCAGTTCC CATCACAGGG AATGAATGTA ACAAATATCC 3540

CTTTGGCTCC GTCCAGCGGT CAAGCTCCAG TGTCTCAAGC ACAAATGTCT AGTTCTTCCT 3600

GCCCGGTGAA CTCTCCTATA ATGCCTCCAG GGTCTCAGGG GAGCCACATT CACTGTCCCC 3660

AGCTTCCTCA ACCAGCTCTT CATCAGAATT CACCCTCGCC TGTACCTAGT CGTACCCCCA 3720

CCCCTCACCA TACTCCCCCA AGCATAGGGG CTCAGCAGCC ACCAGCAACA ACAATTCCAG 3780

CCCCTGTTCC TACACCACCA GCCATGCCAC CTGGGCCACA GTCCCAGGCT CTACATCCCC 3840

CTCCAAGGCA GACACCTACA CCACCAACAA CACAACTTCC CCAACAAGTG CAGCCTTCAC 3900

TTCCTGCTGC ACCTTCTGCT GACCAGCCCC AGCAGCAGCC TCGCTCACAG CAGAGCACAG 3960

CAGCGTCTGT TCCTACCCCA AACGCACCGC TGCTTCCTCC GCAGCCTGCA ACTCCACTTT 4020

CCCAGCCAGC TGTAAGCATT GAAGGACAGG TATCAAATCC TCCATCTACT AGTAGCACAG 4080

AAGTGAATTC TCAGGCCATT GCTGAGAAGC AGCCTTCCCA GGAAGTGAAG ATGGAGGCCA 4140

AAATGGAAGT GGATCAACCA GAACCAGCAG ATACGCAGCC GGAGGATATT TCAGAGTCTA 4200

AAGTGGAAGA CTGTAAAATG GAATCTACCG AAACAGAAGA GAGAAGCACT GAGTTAAAAA 4260

CTGAAATAAA AGAGGAGGAA GACCAGCCAA GTACTTCAGC TACCCAGTCA TCTCCGGCTC 4320

CAGGACAGTC AAAGAAAAAG ATTTTCAAAC CAGAAGAACT ACGACAGGCA CTGATGCCAA 4380

CATTGGAGGC ACTTTACCGT CAGGATCCAG AATCCCTTCC CTTTCGTCAA CCTGTGGACC 4440

CTCAGCTTTT AGGAATCCCT GATTACTTTG ATATTGTGAA GAGCCCCATG GATCTTTCTA 4500

CCATTAAGAG GAAGTTAGAC ACTGGACAGT ATCAGGAGCC CTGGCAGTAT GTCGATGATA 4560

TTTGGCTTAT GTTCAATAAT GCCTGGTTAT ATAACCGGAA AACATCACGG GTATACAAAT 4620

ACTGCTCCAA GCTCTCTGAG GTCTTTGAAC AAGAAATTGA CCCAGTGATG CAAAGCCTTG 4680

GATACTGTTG TGGCAGAAAG TTGGAGTTCT CTCCACAGAC ACTGTGTTGC TACGGCAAAC 4740

AGTTGTGCAC AATACCTCGT GATGCCACTT ATTACAGTTA CCAGAACAGG TATCATTTCT 4800

GTGAGAAGTG TTTCAATGAG ATCCAAGGGG AGAGCGTTTC TTTGGGGGAT GACCCTTCCC 4860

AGCCTCAAAC TACAATAAAT AAAGAACAAT TTTCCAAGAG AAAAAATGAC ACACTGGATC 4920

CTGAACTGTT TGTTGAATGT ACAGAGTGCG GAAGAAAGAT GCATCAGATC TGTGTCCTTC 4980

ACCATGAGAT CATCTGGCCT GCTGGATTCG TCTGTGATGG CTGTTTAAAG AAAAGTGCAC 5040

GAACTAGGAA AGAAAATAAG TTTTCTGCTA AAAGGTTGCC ATCTACCAGA CTTGGCACCT 5100

TTCTAGAGAA TCGTGTGAAT GACTTTCTGA GGCGACAGAA TCACCCTGAG TCAGGAGAGG 5160

TCACTGTTAG AGTAGTTCAT GCTTCTGACA AAACCGTGGA AGTAAAACCA GGCATGAAAG 5220

CAAGGTTTGT GGACAGTGGA GAGATGGCAG AATCCTTTCC ATACCGAACC AAAGCCCTCT 5280

TTGCCTTTGA AGAAATTGAT GGTGTTGACC TGTGCTTCTT TGGCATGCAT GTTCAAGAGT 5340

ATGGCTCTGA CTGCCCTCCA CCCAACCAGA GGAGAGTATA CATATCTTAC CTCGATAGTG 5400

TTCATTTCTT CCGTCCTAAA TGCTTGAGGA CTGCAGTCTA TCATGAAATC CTAATTGGAT 5460

ATTTAGAATA TGTCAAGAAA TTAGGTTACA CAACAGGGCA TATTTGGGCA TGTCCACCAA 5520

GTGAGGGAGA TGATTATATC TTCCATTGCC ATCCTCCTGA CCAGAAGATA CCCAAGCCCA 5580

AGCGACTGCA GGAATGGTAC AAAAAAATGC TTGACAAGGC TGTATCAGAG CGTATTGTCC 5640

ATGACTACAA GGATATTTTT AAACAAGCTA CTGAAGATAG ATTAACAAGT GCAAAGGAAT 5700

TGCCTTATTT CGAGGGTGAT TTCTGGCCCA ATGTTCTGGA AGAAAGCATT AAGGAACTGG 5760

AACAGGAGGA AGAAGAGAGA AAACGAGAGG AAAACACCAG CAATGAAAGC ACAGATGTGA 5820

CCAAGGGAGA CAGCAAAAAT GCTAAAAAGA AGAATAATAA GAAAACCAGC AAAAATAAGA 5880

GCAGCCTGAG TAGGGGCAAC AAGAAGAAAC CCGGGATGCC CAATGTATCT AACGACCTCT 5940

CACAGAAACT ATATGCCACC ATGGAGAAGC ATAAAGAGGT CTTCTTTGTG ATCCGCCTCA 6000

TTGCTGGCCC TGCTGCCAAC TCCCTGCCTC CCATTGTTGA TCCTGATCCT CTCATCCCCT 6060

GCGATCTGAT GGATGGTCGG GATGCGTTTC TCACGCTGGC AAGGGACAAG CACCTGGAGT 6120

TCTCTTCACT CCGAAGAGCC CAGTGGTCCA CCATGTGCAT GCTGGTGGAG CTGCACACGC 6180

AGAGCCAGGA CCGCTTTGTC TACACCTGCA ATGAATGCAA GCACCATGTG GAGACACGCT 6240

GGCACTGTAC TGTCTGTGAG GATTATGACT TGTGTATCAC CTGCTATAAC ACTAAAAACC 6300

ATGACCACAA AATGGAGAAA CTAGGCCTTG GCTTAGATGA TGAGAGCAAC AACCAGCAGG 6360

CTGCAGCCAC CCAGAGCCCA GGCGATTCTC GCCGCCTGAG TATCCAGCGC TGCATCCAGT 6420

CTCTGGTCCA TGCTTGCCAG TGTCGGAATG CCAATTGCTC ACTGCCATCC TGCCAGAAGA 6480

TGAAGCGGGT TGTGCAGCAT ACCAAGGGTT GCAAACGGAA AACCAATGGC GGGTGCCCCA 6540

TCTGCAAGCA GCTCATTGCC CTCTGCTGCT ACCATGCCAA GCACTGCCAG GAGAACAAAT 6600

GCCCGGTGCC GTTCTGCCTA AACATCAAGC AGAAGCTCCG GCAGCAACAG CTGCAGCACC 6660

GACTACAGCA GGCCCAAATG CTTCGCAGGA GGATGGCCAG CATGCAGCGG ACTGGTGTGG 6720

TTGGGCAGCA ACAGGGCCTC CCTTCCCCCA CTCCTGCCAC TCCAACGACA CCAACTGGCC 6780

AACAGCCAAC CACCCCGCAG ACGCCCCAGC CCACTTCTCA GCCTCAGCCT ACCCCTCCCA 6840

ATAGCATGCC ACCCTACTTG CCCAGGACTC AAGCTGCTGG CCCTGTGTCC CAGGGTAAGG 6900

CAGCAGGCCA GGTGACCCCT CCAACCCCTC CTCAGACTGC TCAGCCACCC CTTCCAGGGC 6960

CCCCACCTAC AGCAGTGGAA ATGGCAATGC AGATTCAGAG AGCAGCGGAG ACGCAGCGCC 7020

AGATGGCCCA CGTGCAAATT TTTCAAAGGC CAATCCAACA CCAGATGCCC CCGATGACTC 7080

CCATGGCCCC CATGGGTATG AACCCACCTC CCATGACCAG AGGTCCCAGT GGGCATTTGG 7140

AGCCAGGGAT GGGACCGACA GGGATGCAGC AACAGCCACC CTGGAGCCAA GGAGGATTGC 7200

CTCAGCCCCA GCAACTACAG TCTGGGATGC CAAGGCCAGC CATGATGTCA GTGGCCCAGC 7260

ATGGTCAACC TTTGAACATG GCTCCACAAC CAGGATTGGG CCAGGTAGGT ATCAGCCCAC 7320

TCAAACCAGG CACTGTGTCT CAACAAGCCT TACAAAACCT TTTGCGGACT CTCAGGTCTC 7380

CCAGCTCTCC CCTGCAGCAG CAACAGGTGC TTAGTATCCT TCACGCCAAC CCCCAGCTGT 7440

TGGCTGCATT CATCAAGCAG CGGGCTGCCA AGTATGCCAA CTCTAATCCA CAACCCATCC 7500

CTGGGCAGCC TGGCATGCCC CAGGGGCAGC CAGGGCTACA GCCACCTACC ATGCCAGGTC 7560

AGCAGGGGGT CCACTCCAAT CCAGCCATGC AGAACATGAA TCCAATGCAG GCGGGCGTTC 7620

AGAGGGCTGG CCTGCCCCAG CAGCAACCAC AGCAGCAACT CCAGCCACCC ATGGGAGGGA 7680

TGAGCCCCCA GGCTCAGCAG ATGAACATGA ACCACAACAC CATGCCTTCA CAATTCCGAG 7740

ACATCTTGAG ACGACAGCAA ATGATGCAAC AGCAGCAGCA ACAGGGAGCA GGGCCAGGAA 7800

TAGGCCCTGG AATGGCCAAC CATAACCAGT TCCAGCAACC CCAAGGAGTT GGCTACCCAC 7860

CACAGCCGCA GCAGCGGATG CAGCATCACA TGCAACAGAT GCAACAAGGA AATATGGGAC 7920

AGATAGGCCA GCTTCCCCAG GCCTTGGGAG CAGAGGCAGG TGCCAGTCTA CAGGCCTATC 7980

AGCAGCGACT CCTTCAGCAA CAGATGGGGT CCCCTGTTCA GCCCAACCCC ATGAGCCCCC 8040

AGCAGCATAT GCTCCCAAAT CAGGCCCAGT CCCCACACCT ACAAGGCCAG CAGATCCCTA 8100

ATTCTCTCTC CAATCAAGTG CGCTCTCCCC AGCCTGTCCC TTCTCCACGG CCACAGTCCC 8160

AGCCCCCCCA CTCCAGTCCT TCCCCAAGGA TGCAGCCTCA GCCTTCTCCA CACCACGTTT 8220

CCCCACAGAC AAGTTCCCCA CATCCTGGAC TGGTAGCTGC CCAGGCCAAC CCCATGGAAC 8280

AAGGGCATTT TGCCAGCCCG GACCAGAATT CAATGCTTTC TCAGCTTGCT AGCAATCCAG 8340

GCATGGCAAA CCTCCATGGT GCAAGCGCCA CGGACCTGGG ACTCAGCACC GATAACTCAG 8400

ACTTGAATTC AAACCTCTCA CAGAGTACAC TAGACATACA CTAGAGACAC CTTGTATTTT 8460

GGGAGCAAAA AAATTATTTT CTCTTAACAA GACTTTTTGT ACTGAAAACA ATTTTTTTGA 8520

ATCTTTCGTA GCCTAAAAGA CAATTTTCCT TGGAACACAT AAGAACTGTG CAGTAGCCGT 8580

TTGTGGTTTA AAGCAAACAT GCAAGATGAA CCTGAGGGAT GATAGAATAC AAAGAATATA 8640

TTTTTGTTAT GGGCTGGTTA CCACCAGCCT TTCTTCCCCT TTGTGTGTGT GGTTCAAGTG 8700

TGCACTGGGA GGAGGCTGAG GCCTGTGAAG CCAAACAATA TGCTCCTGCC TTGCACCTCC 8760

AATAGGTTTT ATTATTTTTT TTAAATTAAT GAACATATGT AATATTAATG AACATATGTA 8820

ATATTAATAG TTATTATTTA CTGGTGCAGA TGGTTGACAT TTTTCCCTAT TTTCCTCACT 8880

TTATGGAAGA GTTAAAACAT TTCTAAACCA GAGGACAAAA GGGGTTAATG TTACTTTGAA 8940

ATTACATTCT ATATATATAT AAATATATAT AAATATATAT TAAAATACCA GTTTTTTTTC 9000

TCTGGGTGCA AAGATGTTCA TTCTTTTAAA AAATGTTTAA AAAAAA 9046

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7326 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

AT GGC C GAGA ACTTGCTGGA CGGACCGCCC AACCCCAAAC GAGCCAAACT CAGCTCGCCC 60 

GGCTTCTCCG CGAATGACAA CACAGATTTT GGAT CATTGT TTGACTTGGA AAATGACCTT 120 

CCTGATGAGC TGATCCCCAA TGGAGAATTA AGCCTTTTAA ACAGTGGGAA CCTTGTTCCA 180 

GATGCTGCGT CCAAACATAA ACAACTGTCA GAGCTTCTTA GAGGAGGCAG CGGCTCTAGC 2 40 

AT C AAC C C AG GGATAGGCAA TGTGAGTGCC AGCAGCCCTG TGCAACAGGG CCTTGGTGGC 300 

CAGGCTCAGG GGCAGCCGAA CAGTACAAAC ATGGCCAGCT TAG GT G C CAT GGGCAAGAGC 360 

CCTCTGAACC AAGGAGACTC ATCAACACCC AACCTGCCCA AACAGGCAGC CAGCACCTCT 4 20 

GGGCCCACTC CCCCTGCCTC CCAAGCACTG AATCCACAAG CACAAAAGCA AGTAGGGCTG 4 80 

GTGACCAGTA GTCCTGCCAC AT C ACAGAC T GGACCTGGGA TCTGCATGAA TGCTAACTTC 54 0 

AACCAGACCC ACCCAGGCCT TCTCAATAGT AACTCTGGCC ATAGCTTAAT GAAT CAGGCT 600 

CAACAAGGGC AAGCTCAAGT CATGAATGGA TCTCTTGGGG CTGCTGGAAG AGGAAGGGGA 660 

GCTGGAATGC CCTACCCTGC TCCAGCCATG CAGGGGGCCA CAAGCAGTGT GCTGGCGGAG *7 2 0 

ACCTTGACAC AGGTTTCCCC ACAAATGGCT GGCCATGCTG GACTAAATAC AGCACAGGCA 7 80 

GGAGGC AT GA CCAAGATGGG AATGACTGGT ACCACAAGTC CAT T T G GAC A ACCCTTTAGT 8 40 

CAAACTGGAG GGCAGCAGAT GGGAGCCACT GGAGTGAACC CCCAGTTAGC CAGCAAACAG 900 

AGCATGGTCA ATAGTTTACC TGCTTTTCCT ACAGAT AT CA AGAATACTTC AGTCACCACT 960 

GTGCCAAATA TGTCCCAGTT GCAAACAT CA GTGGGAATTG TACCCACACA AGCAATTGCA 102 0 

ACAGGCCCCA CAGCAGACCC TGAAAAACGC AAACT GATAC AGCAGCAGCT GGTTCTACTG 108 0 

CTTCATGCCC ACAAATGTCA GAGACGAGAG CAAGCAAATG GAGAGGTTCG NGCCTGTTCT 1140 

CTCCCACACT GT C GAAC CAT GAAAAACGTT TT GAAT C AC A TGACACATTG TCAGGCTCCC 12 00 

AAAGCCTGCC AAGTTGCCCA TTGTGCATCT T C AC GAC AAA TCATCTCTCA TT GGAAGAAC 12 60 

T GC AC AC GAC ATGACTGTCC TGTTTGCCTC CCTTTGAAAA ATGCCAGTGA CAAGCGAAAC 1.320 

CAACAAACCA TCCTGGGATC TCCAGCTAGT GGAAT T C AAA ACACAATTGG TTCTGTTGGT 1380 

GCAGGGCAAC AGAATGC C AC TTCCTTAAGT AACCCAAATC CCATAGACCC CAGTT C CAT G 144 0 

CAGCGGGCCT ATGCTGCTCT AGGACTCCCC T AC AT GAAC C AGCCTCAGAC GCAGCTGCAG 1500 

CCTCAGGTTC CTGGCCAGCA AC CAGC AC AG CCTCCAGCCC ACCAGCAGAT GAGGACTCTC 1560 

AATGCCCTAG GAAACAACCC CATGAGTGTC CCAGCAGGAG GAATAACAAC AGAT C AAC AG 162 0 

CCACCAAACT TGATTTCAGA ATCAGCTCTT CCAACTTCCT TGGGGGCTAC CAAT C C ACT G 168 0 

ATGAAT GATG GTTCAAACTC T GGTAAC AT T GGAAGC CTCA GC AC GATAC C TACAGCAGCG 17 4 0 

CCTCCTTCCA GCACT GGTGT TCGAAAAGGC T GGCAT GAAC ATGTGACTCA GGACCTACGG 1800 

AGT CAT CT AG T C CAT AAACT CGTTCAAGCC ATCTTCCCAA CTCCAGACCC TGCAGCTCTG 18 60 

AAAGATCGCC GC AT GGAGAA CCTGGTTGCC TATGCTAAGA AAGT GGAGGG AGACATGTAT 192 0 

GAGTCTGCTA ATAGCAGGGA T GAATACT AT CATTTATTAG CAGAGAAAAT CTATAAAATA 198 0 

CAAAAAGAAC TAGAAGAAAA GCGGAGGACA CGTTTACATA AGCAAGGCAT CCTGGGTAAC 2 04 0 

CAGCCAGCTT TACCAGCTTC TGGGGCTCAG CCCCCTGTGA TTCCACCAGC CCAGTCTGTA 2100 

AGACCTCCAA ATGGGCCCCT GCCTTTGCCA GT GAAT C G C A TGCAGGTTTC TCAAGGGATG 2160 

AATTCATTTA AC C CAAT GT C CCTGGGAAAC GTCCAGTTGC CACAGGCACC C ATGGGAC CT 222 0 

CGTGCAGCCT CCCCTATGAA CCACTCTGTG C AG AT GAAC A GCATGGCCTC AGTTCCGGGT 22 8 0 

ATGGCCATTT CTCCTTCACG GATGCCTCAG CCTCCAAATA TGATGGGCAC T CAT GC C AAC 2 340 

AACATTATGG CCCAGGCACC TACTCAGAAC CAGTTTCTGC CACAGAACCA GTTTCCATCA 2 4 00 

TCCAGTGGGG CAAT GAGT GT GAACAGT GTG GGCATGGGGC AACCAGCAGC CCAGGCAGGT 2 4 60 

GTTTCACAGG GT CAG GAAC C TGGAGCTGCT CTCCCTAACC C T C T GAAC AT GCTGGCACCC 2 520 

CAGGCCAGCC AGCTGCCTTG CCCACCAGTG ACACAGTCAC CATTGCACCC GACTCCACCT 2 58 0 

CCTGCTTCCA CAGCTGCTGG CATGCCCTCT CTCCAACATC CAACGGCACC AGGAAT GAC C 2 640 

CCTCCTCAGC CAGCAGCTCC CACTCAGCCA TCTACTCCTG TGTCATCTGG GCAGACTCCT 2700 

ACCCCAACTC CTGGCTCAGT GCCCAGCGCT GCCCAAACAC AGAGTACCCC T AC AGT C CAG 27 60 

GCAGCAGCAC AGGCTCAGGT GACT C C AC AG CCTCAGACCC CAGTGCAGCC AC CAT CT GTG 282 0 

GCTACTCCTC AGT CAT CAC A GC AG C AAC C A ACGCCTGTGC AT ACT C AGCC ACCTGGCACA 28 80 

CCGCTTTCTC AGGCAGCAGC CAGCATTGAT AATAGAGTCC CTACTCCCTC CACTGTGACC 2940

AGTGCTGAAA CCAGTTCCCA GCAGCCAGGA CCCGATGTGC CCATGCTGGA AATGAAGACA 3000

GAGGTGCAGA CAGATGATGC TGAGCCTGAA C CT AC T GAAT CCAAGGGGGA ACCTCGGTCT 3060 

GAGATGATGG AAGAGGATTT ACAAGGTTCT TCCCAAGTAA AAGAAGAGAC AGATACGACA 312 0 

GAG C AGAAGT CAGAGCCAAT GGAAGTAGAA GAAAAGAAAC CTGAAGTAAA AGTGGAAGCT 3180 

AAAGAGGAAG AAGAGAACAG TTCGAACGAC ACAGCCTCAC AATCAACATC TCCTTCCCAG 32 4 0 

CCACGCAAAA AAATCTTTAA ACCCGAGGAG CTACGCCAGG C AC TT AT GC C AACT CTAGAA 33 00 

GCACTCTATC GACAGGACCC AGAGTCTTTG CCTTTTCGTC AGCCTGTAGA TCCTCAGCTC 3 3 60 

CTAGGAATCC CAGATTATTT TGATATAGTG AAGAATCCTA TGGACCTTTC T AC CAT C AAA 342 0 

CGAAAGCTGG ACACAGGGCA AT AT CAAGAA CCCTGGCAGT AT GTGGAT GA TGTCAGGCTT 34 8 0 

ATGTTCAACA ATGCGTGGCT ATATAATCGT AAAACGTCCC GTGTATATAA ATTTTGCAGT 35 40 

AAACTTGCAG AGGTCTTTGA ACAAGAAATT GACCCTGTCA TGCAGTCTCT TGGATATT GC 3600 

TGTGGACGAA AGTATGAGTT CTCCCCACAG ACTTTGTGCT GTT AC GGAAA GCAGCTGTGT 3660 

ACAATTCCTC GT GAT GCAGC CTACTACAGC TAT C AGAAT A GGTATCATTT CTGTGGGAAG 372 0 

TGTTTCACAG AGATCCAGGG CGAGAATGTG ACCCTGGGTG ACGACCCTTC CCAACCTCAG 37 8 0 

ACGACAATTT CCAAGGATCA ATTTGAAAAG AAGAAAAATG ATACCTTAGA TCCT GAACCT 38 4 0 

TTTGTTGACT GCAAAGAGTG TGGCCGGAAG AT G CAT C AGA TTTGTGTTCT AC ACTAT GAC 3900 

ATCATTTGGC CTT CAGGTTT T GT GT GT GAC AACTGTTTGA AGAAAACTGG CAGACCTCGG 3960 

AAAGAAAACA AATTCAGT GC TAAGAGGCTG CAGACCACAC GAT T GGGAAA CCACTTAGAA 402 0 

GACAGAGTGA ATAAGTTTTT GCGGCGCCAG AATCACCCTG AAGCTGGGGA GGTTTTTGTC 408 0 

AGAGTGGTGG CCAGCTCAGA CAAGACTGTG GAGGTCAAGC CGGGAATGAA GTCAAGGTTT 414 0 

GTGGATTCTG GAGAGATGTC GGAATCTTTC C CAT AT C GT A CCAAAGCACT CTTTGCTTTT 42 00 

G AGGAG AT C G AT GGAGT CGA TGTGTGCTTT TTTGGGATGC AT GT GCAAGA TACGGCTCTG 4260 

ATTGCCCCCC ACCAAATACA AGGCTGTGTA T AC AT AT CTT AT CT GGAC AG T ATT CAT T T C 4320 

TTCCGGCCCC GCTGCCTCCG GACAGCT GT T TACCATGAGA TCCTCATCGG ATATCTCGAG 4 38 0 

TATGTGAAGA AATTGGTGTA TGTGACAGCA CATATTTGGG CCTGTCCCCC AAGT GAAGGA 4440 

GAT GACTAT A TCTTTCATTG CCACCCCCCT GAC C AGAAAA TCCCCAAACC AAAAC GACT A 4 50 0 

CAGGAGT GGT ACAAGAAGAT GCTGGACAAG GCGTTTGCAG AGAGGAT CAT TAAC GACTAT 4 560 

AAGGACATCT TCAAACAAGC GAAC GAAGAC AGGCTCACGA GTGCCAAGGA GTTGCCCTAT 4 62 0 

TTTGAAGGAG ATTTCTGGCC TAATGTGTTG GAAGAAAGCA TTAAGGAACT AGAACAAGAA 4 68 0 

GAAGAAGAAA GGAAAAAAGA AGAGAGT AC T GCAGCGAGTG AGACTCCTGA GGGCAGTCAG 47 4 0 

GGTGACAGCA AAAATGCGAA GAAAAAGAAC AACAAGAAGA CCAACAAAAA CAAAAGCAGC 480 0 

ATTAGCCGCG CCAACAAGAA GAAGCCCAGC ATGCCCAATG TTTCCAACGA CCTGTCGCAG 4860 

AAGCTGTATG C C AC CAT GG A GAAGCACAAG GAGGTATTCT TTGTGATTCA TCTGCATGCT 492 0 

GGGCCTGTTA T CAGCACT C A GCCCCCCATC GTGGACCCTG ATCCTCTGCT TAGCTGTGAC 4 98 0 

CT CAT GGAT G GGCGAGATGC CTTCCTCACC CTGGCCAGAG ACAAGCACTG GGAATTCTCT 504 0 

TCCTTACGCC GCTCCAAATG GTCCACTCTG TGCATGCTGG TGGAGCTGCA CACACAGGGC 5100 

CAGGACCGCT TTGTTTATAC CTGCAATGAG TGCAAACACC ATGTGGAAAC ACGCTGGCAC 5160

TGCACTGTGT GTGAGGACTA TGACCTTTGT ATCAATTGCT ACAACACAAA GAGCCACACC 522 0 

CATAAGATGG TGAAGTGGGG GCTAGGCCTA GATGATGAGG GCAGCAGTCA GGGTGAGCCA 5280 

CAGTCCAAGA GCCCCCAGGA ATCCCGGCGT CTCAGCATCC AGCGCTGCAT CCAGTCCCTG 5 34 0 

GTGCATGCCT GCCAGTGTCG CAATGC CAAC TGCTCACTGC CGTCTTGCCA GAAGATGAAG 54 00 

CGAGTCGTGC AGCACACCAA GGGCT GCAAG CGCAAGACTA AT GGAGGAT G CCCAGTGTGC 5460 

AAGCAGCTCA TTGCTCTTTG CTGCTACCAC GCCAAACACT GC CAAGAAAA TAAATGCCCT 5 520 

GTGCCCTTCT GCCTCAACAT CAAACATAAC GTCCGCCAGC AGCAGAT CC A GCACTGCCTG 558 0 

CAGCAGGCTC AGCTCATGCG CCGGCGAATG GC AAC CAT GA ACACCCGCAA TGTGCCTCAG 5640 

CAGAGTTTGC CTTCTCCTAC CTCAGCACCA CCCGGGACTC CTACACAGCA GCCCAGCACA 5700 

CCCCAAACAC CACAGCCCCC AGCCCAGCCT CAGCCTTCAC CTGT TAACAT GTCACCAGCA 5760 

GGCTTCCCTA ATGTAGCCCG GACTCAGCCC CCAACAATAG TGTCTGCTGG GAAGC CT AC C 5820 

AACCAGGTGC CAGCTCCCCC ACCCCCTGCC CAGCCCCCAC CTGCAGCAGT AGAAGCAGCC 5880 

CGGCAAATTG AACGTGAGGC C CAGCAGC AG CAGCACCTAT ACC GAGCAAA CAT C AAC AAT 5940 

GGCATGCCCC CAGGAC GT GA CGGTATGGGG ACCCCAGGAA GCCAAAT GAC TCCTGTGGGC 6000 

CTGAAT GTGC CCCGTCCCAA CCAAGT CAGT GGGCCTGTCA TGTCTAGTAT GCCACCTGGG 6060 

CAGTGGCAGC AGGCACCCAT CCCTCAGCAG CAGCCGATGC CAGGCATGCC CAGGCCTGTA 6120 

ATGTCCATGC AGGCCCAGGC AGCAGTGGCT GGGCCACGGA TGCCCAATGT GCAGC CAAAC 6180 

AGGAGCATCT CGCCAAGTGC CCTGCAAGAC CTGCTACGGA CCCTAAAGTC ACCCAGCTCT 6240 

CCTCAGCAGC AGCAGCAGGT GCTGAACATC CTTAAATCAA ACCCACAGCT AATGGCAGCT 6300 

TT CAT CAAAC AGCGCACAGC C AAGT AT GT G GCCAATCAGC CTGGCATGCA GCCCCAGCCC 63 60 

GGACTT CAAT CCCAGCCTGG TAT GCAGC C C CAGCCTGGCA TGCACCAGCA GCCTAGTTTG 642 0 

CAAAACCTGA AC GCAAT GCA AGCTGGTGTG CCACGGCCTG GTGTGCCTCC ACCACAACCA 64 8 0 

GCAAT GGGAG GC CT GAAT C C C CAG GGAC AA GCTCTGAACA T CAT GAAC CC AGGACACAAC 654 0 

CCCAACATGA CAAACATGAA TCCACAGTAC CGAGAAATGG T GAGGAGAC A GCTGCTACAG 6600 

CACCAGCAGC AGCAGCAGCA ACAGC AGC AG CAGCAGCAGC AACAACAAAA TAGTGCCAGC 6660 

TTGGCCGGGG GCATGGCGGG ACACAGC CAG TTCCAGCAGC CACAAGGACC TGGAGGTTAT 67 20 




GCCCCAGCCA TGCAGCAGCA AC GC AT G CAA CAGCACCTCC CCATCCAGGG CAGCTCCATG 67 80 

GGCCAGATGG CTGCTCCAAT GGGACAACTT GGCCAGATGG GGCAGCCTGG GCTAGGGGCA 68 4 0 

GACAGCACCC CTAATAT C C A GCAGGCCCTG CAGCAACGGA TTCTGCAGCA GCAGCAGATG 69 00 

AAGCAACAAA TTGGGTCACC AGGCCAGCCG AACCCCATGA GCCCCCAGCA GCACATGCTC 6960 

TCAGGACAGC CACAGGCCTC ACATCTCCCT GGCCAGCAGA TCGCCACATC CCTTAGTAAC 7 02 0 

CAGGTGCGAT CTCCAGCCCC TGTGCAGTCT CCACGGCCCC AATCCCAACC TCCACATTCC 7 08 0 

AGCCCGTCAC CACGGATACA ACCCCAGCCT TCACCACACC ATGTTTCACC CCAGACTGGA 714 0 

ACCCCTCACC CTGGACTCGC AGT C AC CAT G GCCAGCTCCA TGGATCAGGG ACACCTGGGG 72 00 

AACCCTGAAC AGAGTGCAAT GCTCCCCCAG CTGAATACCC CCAACAGGAG CGCACTGTCC 72 60 

AGT GAACTGT CCCTGGTTGG T GAT AC C AC G GGAGACACAC TAGAAAAGTT TGTGGAGGGT 7 32 0 

TTGTAG 7326

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2499 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TCACTTGTCA ATTAATCCAG CTTCCTTAAT TTTACTGAAG AAGAATTTCT CCAGGATATT 60 
GGCACATTTG TAGTATTCAC TCTCAGGGGC GTTGTACTCT TTGCAATTGG TAAAGACTCG 120
CTGTAAGTCT GCCATGAATA ATTTCTTAGA CACGTAGTAC CTATTCTTGA GGCGTTCACT 180
CATGGTTTTG AGATCCATGG GGGACCTTAT AACTTCATAA TATCCTGGAG CTTCTGTTCT 240
CTTCACAGGT TCCATGAAGG GCCAAGCGCT TTGATGGCTC TTCACCTGCT GGAGGATGCT 300
CTTGAGCGTG CTGTAAAGCT GGTCAGGGTC TCTGGGCTCT TTACTTTTCT CTTTTCCACT 360
CGGTTTCCAG CCTGTCTCTC TAATTCCAGG AATGCTTTCT ATAGGAATCT GTCGAACTCC 420

ATCTTTAAAA CAT GAAAGTC CAGGGTAAAC TTTTCGAATT TGTGCCTGTT TTCTTTCAAT 4 80 

CAGTTTTTTA ATTATCTCCT TCTGCTTTTT AATGATGACA GAAAATTCTG TGTACGGGAT 540
CCGTGGATTT AGCTCACATC CCATTAAAGT GGCTCCTTCA TAATCCTTGA TATAGCCAAC 600
ATATTTGGTT TTAGGTATTT TAATTTCTTT GGAGAAACCC TGTTTCTTAA AGTATCCAAT 660
TGCATATTCA TCTGCATATG TGAGGAAGTT CAGGATGTCA TGCTTTATGT GATATTCTTT 720
CAAATGATTC ATCAGGTGTG TTCCATAGCC CTTGACTTGC TCATTTGAGG TTACAGCACA 780
GAAGACAATC TCTGTGAATC CTTGAGATGG GAACATACGG AAACAGATAC CACCAATAAC 840
ACGGCCATCT TTAATTAAAG CAAGGGTTTT GTGTTTCGGG TCAAAGACGA GCCGTGTGAT 900

GTATTCTTTT GGCATTCGGG GCAGCTGGTG GGAGAAAACG TTCTGTAGGC CAACCAGCCA 960 

CAT CAGGAT C TTCTTGTTTG GTTTCTGGTT GAGGGAATTG CCAACCACGT GAAATTCAAT 102 0 

TACACCCCTG CGCTCTTCCA ACCTTGCCGC CTCATCCCTG GCCGAGTGTG CT GACAGAAA 108 0 

ATTGGTCTCT GGT C C AAGC A TTGCTGCAGG GTCCGTGATG GTAGACATAA CCTCGTTGAT 114 0 

TAATT C CAT C GGAATATCCC CCATAACTCG GGGTTTCTTG GCCTCCTCCA GAACAT GAGA 12 0 0 

AT C AGT CAT T TTCCTCTTTT CTCCTGGGTT TGCCTCAAGT CCAGAAGAGG CTTTGCAGGC 12 60 

AGGACTGCTG CTCCCTGCGT TTGGCTGCTC AAGGGAAGAT GAGGTTGAAT T GT AT GAAAT 132 0 

TGTCCCAGCC ACAGGAGGTG GATTGATAAC TGTTTGGATG CCTAGCTGGC TGGTTCTGGA 1380 

AGAGGCTGAG AGAAAAT C CT GATCCCAGAT GGGAGAGTTT TGACTATATA CTTCTTCTTC 14 40 

TAGCATGGAC AGAAATTTTG GGAAAT GAGT GAGGATTAGA GTTCGTTTTT CAAGAGGCAG 150 0 

TTTATCTTTT TCCTGTCTTG CTTGTTCCAG GAGTTGTCGC CT CAT AAC AG TGAAGACCGA 1560 

GCGAAGCAAT GTTCTCCCAA AC AC CT GT GT GGTTTCGTAC CGAGGTAGAC TGTCGCAGAA 162 0 

CTGTGGCACG TTGCAGTAAC ACAGCCACCT TGTGTAGTTC TCTTTGTATC CAGAAATATC 1680

ATCATTGGGA GATCGCAGTC TTCGTTGAGA TGGTGCCTCC AGATGCCAAT AGTTGATGCG 17 4 0 

GTTTAGGAAC ATTTTTGCCA ACTCAACTAT TGTTTGCCTT TCTTTTGCTG GCAGGTGACT 18 00 

AAATT T GT AC T GC AC AAAGT TATTCACACC CTGTTCAATG CTAGGTTTTT CAAATGGGGG 18 60 

TTTCTTTTCC AAAGAGC CTT CAACCACAGG TTTTCCTCTT T GTAAAAT AG ACTTTCTCAA 192 0 

GAGCTTAAAT AGATAGAAAT AAACTTGTTT GGTATCTGCA TCTTCTTCCT TGTGGACACA 19 80 

GGTAAAGAGA TATTCCACAT CCAATACTAT TCCCAGGAGT CTGTT CATTT CTTCCTCTGA 2 04 0 

CACATTCTCC AGGT GGGAAA CAT GAGC AGC TAGGGC AT GG CTACAACTCC GACAGGATTC 2100 

TGTTAGACTG ACAATTATTT GCTGCAGGTC GGCTCTGGGG GGAGTGGGTG AGGGGTTAGG 2160 

GTTTTTCCAG CCATTACATT T AC AAGACT C CTCGGCCTTG CAGGCGGAGT ACACTCCGAG 222 0 

TTTCTCCAGT TTCTTGGCCC GCGGAGCGGA GCGTAGTTGC GCTTTCTTCA CGGCGATTCG 2280

GGCCGAGCCA CCGCCTCCCG GTCCTTCGGC CGTGCCCGCT GCAGCCACTG CCGTCGCCGG 2 34 0 

ACCGCAGGCG CCCGAGCCCC CGGCGGCAGC GGCGCAGGGG GAGCCCTGCG GGGGCGCGGG 2400

CGGAAGCGCC GCAGGCTGCG GGGGCAGCGC CCCGGGCCCG GCCCCTGCCC CGGCTCCTGC 2460

CCCGCAGCCG CCCGGCCCGG CCCCGCCAGC CTCGGACAT 2499

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2442 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TCACTTGTCA ATCAACCCTG CTTCCTTAAT TTTACTGAAG AAGAACTTCT CCAGGATGCT 60 

GGCGCATTTG TAGTACTCGC TCTCGGGAGG GTTGTACTCC TTGCAGTTGG TGAACACTCG 120 

TTGCAAGTCC GCCATGAATA ACTTCTTAGA CACATAGTAC CTGTTCCTGA GGCGTTCACT 180 

CATGGTTTTC AGAT C CAT GG GGAACCTTAT AACTTCATAA TATCCCGGAG CTTCTGTTCT 2 40 

CTTCACTGGT T C CAT GAAAG GCCAAGCATT TGGATGGTTC TTCACCTGCT GCAGGAT GTT 300 

CTTGAGGGTG CTGTAAACGT GCTCAGGGTC TTTGGGCTCT TTACTTTTCT CTTTTCCACT 3 60 

TGGTTTCCAG CCTGTCTCTC TGATTCCAGG AATGCTTTCT AT AGGAAT C T GCCGAACTCC 42 0 

ATCTTTGAAA C AC GAAAGT C CAGGGTAGAC TTTTCGAATC TGGGCTTGTT TTCTTTCTAT 4 80 

CAGCTTTTTA ATGATCTCCT TCTGCTTTTT AATGATGACA GAGAACTCTG TGTATGGGAT 540 

CTGAGGGTTC AGCTCACATC C CAT CAAAGT GGCCCCTTCA TAATCCTTGA TGTAGCCAAC 60 0 

ATATTTGGTT TTAGGT ATT T TGATTTCTTT GGAGAAACCC TGCTTCTTGA AATAGCCGAT 660 

GGCATACTCA TCTGCATATG TGAGGAAGTT GAGGATCTCG TGCTTTATGT GGTATTCTTT 72 0 

GAGATGGTTC AT CAGGT GGG TTCCATAGCC CTTGACTTGT TCATTTGAGG TTACTGCACA 7 80 

GAAAACAATC TCTGTGAATC CCTGGGATGG AAACATCCGG AAACAGATAC CACCAATGAC 84 0 

ACGGCCATCT TTAATTAAAG CAAGGGTTTT GTGTTTCGGG TCAAAGACGA GCCGTGTGAT 900 

GTACTCTTTG GGCATTCTGG GCAGCTGGTG GGAAAACACA TTCTGGAGGC CCACGAGCCA 960 

CAT CAGGAT C TTCTTGTTTG GTTTCTGGTT CAGGGAGTTG CCCACCACGT GGAATTCAAT 1020 

GACACCCCTG CGTTCTTCCA GCCGTGCCGC CTCATCTCTG GCCGAATGGG CTGACAGAAA 1080 

ATTGGTCTCT GGTCCAAGCA TCCCTGCAGG GTCTGTGATG GTAGACATGA CCT CATTGAT 114 0 

CAATTCCACG GGAATATCCC CCATCACTCG AGATCTCTTG GCCTCCTCGG GAGC AT GAGA 12 00 

GTT GTT CAT T TTCCTCTTTT CTCCCGGGTT TGCTTCAAGC CCAGAAGAGC CTCTGCATCC 12 60 

AGGACTTGTT CTCCCTCCAT TGATCTGCTC ATGGGAAGTT GAATTT GAAC TGAACAATGC 132 0 

TGTCCCAGTA ACAGGAGGAC T GATT ACT GT TTGGATTCCT AGCGGGCTGG TTCTGGAAGA 138 0 

GGCT GAGAGA AAATCCTGAT C C C AGAT AGG AGAATTTTGA CTATACACTT CTTCTTCCAA 1440 

CAT GGAC AGA AACTTTGGGA AATGTGTGAG GATAAGCGTG CGTTTCTCAA GAGGCAGTTT 1500 

GTCTTTTTTC TGTCTGGCTT GTTCCAAGAG CTGTCGTCTC ATGATGGTGA AGACCGAGCG 1560 

AAGCAATGTT CTCCCAAACA CCTTTGTGGT TTCGTACCGA GGTAAGCTGT CACAGAACTG 162 0 

CGGTACATTG CAGTAGCACA ACCACCTTGT GTAGTTTTCC T T GT AT C C AG AGAT GT CAT C 168 0 

ATTGGGAGAC CGTAGTCTCC GCTGAGATGG AGCCTCCAGA TGCCAGTAGT TGATGCGGTT 174 0 

CAGAAACATC TTGGCCAGCT CGATCGTTGT CTGCCTCTCT TTCGATGGCA AGT GACTAAA 18 00 

CTTGTACTGC AC GAAGTT GT TCACACCCTG TTCAATACTG GGCTTCTCAA ATGGCGGCTT 1860 

CTTCTCCAAG GAGCCTT CAA CCACAGGTTT TCCTCTTTGT AAAATTGACT TTCTCAAGAG 192 0 

CTTGAATAGG TAGAAGTACA CTTGTTTGGT ATCTGCATCT TCTTCTTTGT GGACGCAGGT 1980 

GAAGAG GT AC TCCACATCCA ACACAATTCC CAGGAGTCTG TCCATCTCTT CCTCTGACAC 2 04 0 

ATTCTCCAAG T GAGAAAC GT GAGCAGCAAG GGCATGGCTA CAGCTTCGAC AGGATTCTGT 2100 

CAAACTGACA ATTATCTGCT GGAGGTCTCC TCTTGGTGGA GTAGGAGAGG GGTTAGGGTT 2160 

CTTCCAGCCA TTGCATTTAC AGGACTCCTC TGCCTTGCAG GCGGAGTACA CGCCGAGTTT 222 0 

CTCCAGCTTC TTCGCCCGCG GAGCAGAGCG CAACTGCGCC TTCTTCACGG CGATCCGGGC 228 0 

CGAGCCGCCT CCTCCCGGTC CCTCGGCGGT GCCCGCCGCG GCCACCGGCG TCGCTGGCCC 2340 

GCAGGAAGCA GAGCTCCCGG CAGCGGTGGC CAGGGTCCGG GGGGAACCGT GCGGGGGCGC 2 400 

GGGAGGCAGT GCTGGGGACC CGGCCCCGCC AGCCTCGGCC AT 2442 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CCCGCCAGCC TCGGACATGC 20

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
CCCGCCAGCC TCGGCCATGC 20

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2442 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

ATGGCCGAGG CTGGCGGGGC CGGGTCCCCA GCACTGCCTC CCGCGCCCCC GCACGGTTCC 60

CCCCGGACCC TGGCCACCGC TGCCGGGAGC TCTGCTTCCT GCGGGCCAGC GACGCCGGTG 120

GCCGCGGCGG GCACCGCCGA GGGACCGGGA GGAGGCGGCT CGGCCCGGAT CGCCGTGAAG 180 

AAGGCGCAGT TGCGCTCTGC TCCGCGGGCG AAGAAGCTGG AGAAACTCGG CGTGTACTCC 240 

GCCTGCAAGG CAGAGGAGTC CTGTAAATGC AATGGCTGGA AGAACCCTAA CCCCTCTCCT 300 

ACTCCACCAA GAGGAGACCT CCAGCAGATA ATTGTCAGTT TGACAGAATC CTGTCGAAGC 360 

TGTAGCCATG CCCTTGCTGC TCACGTTTCT CACTTGGAGA ATGTGTCAGA GGAAGAGATG 42 0 

GACAGACTCC TGGGAATTGT GTTGGATGTG GAGTACCTCT TCACCTGCGT CCACAAAGAA 480

GAAGAT GCAG ATACCAAACA AGTGTACTTC TACCTATTCA AGCTCTT GAG AAAGTCAATT 540 

T T AC AAAGAG GAAAACCTGT GGTTGAAGGC TCCTTGGAGA AGAAGCCGCC ATTTGAGAAG 600 

CCCAGTATTG AACAGGGTGT GAACAACTTC GT GCAGT AC A AGTTTAGTCA CTTGCCATCG 660 

AAAGAGAGGC AGACAACGAT CGAGCTGGCC AAGATGTTTC TGAACCGCAT CAACTACTGG 72 0 

CATCTGGAGG CTCCATCTCA GCGGAGACTA CGGTCTCCCA AT GATGACAT CTCTGGATAC 780 

AAGGAAAACT AC ACAAGGT G GTTGTGCTAC T GCAAT GTAC CGCAGTTCTG TGACAGCTTA 8 40 

CCTCGGTACG AAAC C AC AAA GGTGTTTGGG AGAACATTGC TTCGCTCGGT CTTCACCATC 900 

AT GAGAC GAC AGCTCTTGGA ACAAGCCAGA CAGAAAAAAG ACAAACTGCC TCTTGAGAAA 960 

CGCACGCTTA TCCTCACACA TTTCCCAAAG TTTCTGTCCA TGTTGGAAGA AGAAGTGTAT 102 0 

AGT CAAAATT CTCCTATCTG GGAT CAGGAT TTTCTCTCAG CCTCTTCCAG AACCAGCCCG 108 0 

CTAGGAATCC AAACAGTAAT CAGTCCTCCT GTTACTGGGA CAGCATTGTT C AGTT CAAAT 114 0 

TCAACTTCCC AT GAGCAGAT CAATGGAGGG AGAACAAGTC CT GGAT GCAG AGGCTCTTCT 1200 

GGGCTT GAAG CAAACCCGGG AGAAAAGAGG AAAAT GAAC A ACTCTCATGC T C CC GAG GAG 12 60 

GCCAAGAGAT CTCGAGTGAT GGGGGATATT CCCGTGGAAT TGATCAATGA GGTCATGTCT 132 0 

AC CAT CAC AG ACCCTGCAGG GATGCTTGGA CCAGAGACCA ATTTTCTGTC AGCCCATTCG 1380 

GCCAGAGATG AGGCGGCACG GCTGGAAGAA CGCAGGGGTG TCATTGAATT CCACGTGGTG 1440 

GGCAACTCCC TGAACCAGAA AC CAAAC AAG AAGATCCTGA TGTGGCTCGT GGGCCTCCAG 1500 

AATGTGTTTT CCCACCAGCT GC C C AGAAT G CCCAAAGAGT AC AT CAC AC G GCTCGTCTTT 1560 

GACCCGAAAC ACAAAAC C CT TGCTTTAATT AAAGATGGCC GTGTCATTGG TGGTATCTGT 162 0 

TTCCGGATGT TTCCATCCCA GGGATTCACA GAGATTGTTT TCTGTGCAGT AAC CT CAAAT 1680 

GAACAAGTCA AGGGCTATGG AACCCACCTG ATGAACCATC TCAAAGAATA CCACATAAAG 1740

CAC GAGAT CC TCAACTTCCT CACAT AT GC A GAT GAGT AT G CCATCGGCTA TTTCAAGAAG 18 00 

CAGGGTTT CT CCAAAGAAAT C AAAAT AC C T AAAACCAAAT ATGTTGGCTA CATCAAGGAT 18 60 

TATGAAGGGG CCACTTTGAT GGGAT GT GAG CTGAACCCTC AGATCCCATA C AC AGAGTT C 1920 

TCTGTCATCA TTAAAAAGCA GAAGGAGATC ATTAAAAAGC T GAT AGAAAG AAAACAAGCC 1980 

CAGATTCGAA AAGTCTACCC TGGACTTTCG TGTTTCAAAG ATGGAGTTCG GCAGATTCCT 2040

ATAGAAAGCA TTCCTGGAAT CAGAGAGACA GGCTGGAAAC CAAGTGGAAA AGAGAAAAGT 2100

AAAGAGCCCA AAGACCCTGA GCACGTTTAC AGCACCCTCA AGAAC AT CCT GCAGCAGGTG 2160 

AAGAAC CAT C CAAATGCTTG GCCTTTCATG GAACCAGTGA AGAGAACAGA AGCTCCGGGA 2220 

TATTATGAAG TTATAAGGTT CCCCATGGAT CTGAAAACCA TGAGTGAACG CCTCAGGAAC 22 80 

AGGTACT AT G TGTCTAAGAA GTTATT CAT G GCGGACTTGC AACGAGTGTT CACCAACTGC 2 34 0 

AAGGAGTACA ACCCTCCCGA GAGC GAGT AC TACAAATGCG CCAGCATCCT GGAGAAGTTC 2 400 

TTCTTCAGTA AAATTAAGGA AGCAGGGTTG ATTGACAAGT GA 2442
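
As printed, the start of SEQ ID NO: 18 reads as the reverse complement of the end of SEQ ID NO: 15, and the first 18 bases of the 20-mer of SEQ ID NO: 17 match the 3' end of SEQ ID NO: 15. The following Python sketch checks that relationship on short excerpts copied from the listing. It is an illustration only: the helper name `reverse_complement` and the variable names are assumptions and do not appear in the application.

```python
# Watson-Crick complement table; N (any base) is mapped to N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring spaces."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


def strip_spaces(seq: str) -> str:
    """Remove the grouping spaces used in the printed listing."""
    return seq.replace(" ", "")


# Excerpts copied from the listing (grouping spaces kept as printed).
SEQ17 = "CCCGCCAGCC TCGGCCATGC"          # SEQ ID NO: 17, full 20-mer
SEQ15_TAIL = "CGGCCCCGCC AGCCTCGGCC AT"   # last 22 bases of SEQ ID NO: 15
SEQ18_HEAD = "ATGGCCGAGG CTGGCGGGGC"      # first 20 bases of SEQ ID NO: 18

# The first 18 bases of SEQ ID NO: 17 coincide with the 3' end of SEQ ID NO: 15.
primer_matches_tail = strip_spaces(SEQ15_TAIL).endswith(strip_spaces(SEQ17)[:18])

# The reverse complement of the tail of SEQ ID NO: 15 begins SEQ ID NO: 18.
tail_rc = reverse_complement(SEQ15_TAIL)
head_matches = tail_rc.startswith(strip_spaces(SEQ18_HEAD))

print(primer_matches_tail, head_matches)
```

The same helper can be applied to the full sequences once the grouping spaces and position numbers of the printed listing have been stripped.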
What is claimed is:

1. A purified protein designated P/CAF having a molecular weight of about 93,000
daltons as determined by sodium dodecyl sulfate polyacrylamide gel electrophoresis
under reducing conditions and which acetylates histones.

2. The protein of claim 1 consisting of the amino acid sequence of SEQ ID NO:1.

3. The protein of claim 1 comprising the amino acid sequence of SEQ ID NO:2.

4. The protein of claim 1, which also binds to the amino acid sequence of SEQ ID 
NO:3 on a p300 cellular protein and to amino acid residues 1805-1854 of a CBP cellular 
protein (SEQ ID NO:9). 

5. A fragment of the protein of claim 1 having histone acetyltransferase activity. 

6. A polypeptide consisting of the amino acid sequence of SEQ ID NO:2.

7. A fragment of the protein of claim 1 which binds to the amino acid sequence of 
SEQ ID NO: 3 on the p300 cellular protein and the amino acid sequence of SEQ ID 
NO:9 on the CBP cellular protein. 

8. A polypeptide consisting of the amino acid sequence of SEQ ID NO:4.

9. A nucleic acid consisting of the nucleotide sequence of SEQ ID NO:10.

10. A nucleic acid having a nucleotide sequence which encodes the protein of claim 1.

11. A nucleic acid having a nucleotide sequence which encodes the protein of claim 2.

12. A nucleic acid having a nucleotide sequence which encodes the protein of claim 3.

13. A nucleic acid consisting of the nucleotide sequence which encodes the protein 
of claim 4. 

14. A nucleic acid complementary to and which selectively hybridizes with the 
nucleic acid of claim 11 under stringent hybridization conditions.

15. A fragment of the nucleic acid of claim 9, which encodes a polypeptide that
acetylates histones. 

16. A fragment of the nucleic acid of claim 9, which encodes a polypeptide which 
binds to the amino acid sequence of SEQ ID NO:3 on the p300 cellular protein and the 
amino acid sequence of SEQ ID NO:9 on the CBP cellular protein. 

17. A purified antibody which specifically binds the protein of claim 1.

18. A purified antibody which specifically binds the protein of claim 2.

19. A purified antibody which specifically binds the protein of claim 3.

20. A purified antibody which specifically binds the protein of claim 4.

21. An assay for screening substances for the ability to inhibit or stimulate the
histone acetyltransferase activity of P/CAF comprising:

a) contacting the substance with a system in which histone acetylation by
P/CAF can be determined;

b) determining the amount of histone acetylation by P/CAF in the
presence of the substance; and

c) comparing the amount of histone acetylation by P/CAF in the 
presence of the substance with the amount of histone acetylation by P/CAF in the 
absence of the substance, a decreased or increased amount of histone acetylation by 
P/CAF in the presence of the substance indicating a substance that can inhibit or 
stimulate, respectively, the histone acetyltransferase activity of P/CAF. 

22. An assay for screening substances for the ability to inhibit binding of P/CAF to 
p300/CBP comprising: 

a) contacting the substance with a system in which the P/CAF binding of 
p300/CBP can be determined;

b) determining the amount of P/CAF binding of p300/CBP in the presence of 
the substance; and 

c) comparing the amount of binding of P/CAF to p300/CBP in the presence of 
the substance with the amount of binding of P/CAF to p300/CBP in the absence of the 
substance, a decreased amount of binding of P/CAF to p300/CBP in the presence of the 
substance indicating a substance that can inhibit the ability to inhibit binding of P/CAF to 
p300/CBP. 

23. The method of claim 22, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the p300 protein comprising amino acid residues 
1767-1816 (SEQ ID NO:3) and the protein of claim 4. 

24. The method of claim 22, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the CBP protein comprising amino acid residues 
1805-1854 (SEQ ID NO:9) and the protein of claim 4.

25. The method of claim 22, wherein the system consists of a cell extract produced
from cells producing both p300 and P/CAF.

26. An assay for screening substances for the ability to inhibit or stimulate the
histone acetyltransferase activity of p300/CBP comprising:

a) contacting the substance with a system in which histone acetylation by
p300/CBP can be determined;

b) determining the amount of histone acetylation by p300/CBP in the
presence of the substance; and

c) comparing the amount of histone acetylation by p300/CBP in the
presence of the substance with the amount of histone acetylation by p300/CBP in the
absence of the substance, a decreased or increased amount of histone acetylation by
p300/CBP in the presence of the substance indicating a substance that can inhibit or
stimulate, respectively, the histone acetyltransferase activity of p300/CBP.

27. An assay for screening substances for the ability to inhibit binding of a DNA- 
binding transcription factor to p300/CBP comprising: 

a) contacting the substance with a system in which the DNA-binding 
transcription factor binding of p300/CBP can be determined;

b) determining the amount of DNA-binding transcription factor binding of 
p300/CBP in the presence of the substance; and 

c) comparing the amount of binding of DNA-binding transcription factor to 
p300/CBP in the presence of the substance with the amount of binding of DNA-binding 
transcription factor to p300/CBP in the absence of the substance, a decreased amount of 
binding of DNA-binding transcription factor to p300/CBP in the presence of the 
substance indicating a substance that can inhibit the ability to inhibit binding of DNA- 
binding transcription factor to p300/CBP. 

28. The method of claim 27, wherein the system consists of a cell free reaction 
mixture comprising a DNA-binding transcription factor and p300/CBP. 

29. The method of claim 27, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the CBP protein comprising a DNA-binding 
transcription factor and p300/CBP.

30. The method of claim 27, wherein the system consists of a cell extract produced
from cells producing both a DNA-binding transcription factor and p300/CBP.

31. The method of claim 27, wherein the DNA-binding transcription factor is
selected from the group consisting of a nuclear hormone receptor, CREB, c-Jun/v-Jun,
c-Myb/v-Myb, YY1, Sap-1a, c-Fos, MyoD and SRC-1.

32. A method for inhibiting the transcription modulating activity of P/CAF in a 
subject, comprising administering to the subject a transcription modulating activity 
inhibiting amount of a substance in a pharmaceutically acceptable carrier.

33. The method of claim 32, wherein the substance can inhibit the transcription
modulating activity of P/CAF by preventing the binding of P/CAF to p300/CBP.

34. A method for stimulating the transcription modulating activity of P/CAF in a 
subject, comprising administering to the subject a transcription modulating activity 
stimulating amount of a substance in a pharmaceutically acceptable carrier.

35. The method of claim 34, wherein the substance can stimulate the transcription 
modulating activity of P/CAF by promoting the binding of P/CAF to p300/CBP. 

36. The method of claim 34, wherein the substance can stimulate the transcription 
modulating activity of P/CAF by stimulating the histone acetyltransferase activity of
P/CAF. 

37. A method for inhibiting the histone acetyltransferase activity of p300/CBP in a 
subject, comprising administering to the subject a histone acetyltransferase activity 
inhibiting amount of a substance in a pharmaceutically acceptable carrier.




38. The method of claim 37, wherein the substance can inhibit the transcription 
modulating activity of p300/CBP by preventing the binding of a DNA-binding 
transcription factor to p300/CBP. 

39. The method of claim 38, wherein the DNA-binding transcription factor is 
selected from the group consisting of a nuclear hormone receptor, CREB, c-Jun/v-Jun, 
c-Myb/v-Myb, YY1, Sap-1a, c-Fos, MyoD and SRC-1.

40. The method of claim 37, wherein the substance is an antibody which binds 
p300/CBP. 

41. A method for stimulating the histone acetyltransferase activity of p300/CBP in a
subject, comprising administering to the subject a histone acetyltransferase activity
stimulating amount of a substance in a pharmaceutically acceptable carrier.

42. The method of claim 41, wherein the substance can stimulate the histone 
acetyltransferase activity of p300/CBP by promoting the binding of a DNA-binding 
transcription factor to p300/CBP. 
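
The comparison step recited in claims 21, 22, 26 and 27 amounts to measuring a signal in the presence and in the absence of the test substance and classifying the difference. A minimal Python sketch of that logic follows. It assumes replicate measurements in arbitrary units and a 20% relative-change cutoff; neither is specified by the claims, and the function and parameter names are hypothetical.

```python
from statistics import mean


def classify_substance(with_substance, without_substance, min_relative_change=0.2):
    """Compare the mean signal with and without the test substance.

    Returns "inhibits", "stimulates", or "no clear effect".
    min_relative_change is an assumed cutoff, not a value from the claims.
    """
    control = mean(without_substance)
    if control <= 0:
        raise ValueError("control signal must be positive")
    relative_change = (mean(with_substance) - control) / control
    if relative_change <= -min_relative_change:
        return "inhibits"
    if relative_change >= min_relative_change:
        return "stimulates"
    return "no clear effect"


# Hypothetical replicate readings in arbitrary units.
treated = [410, 395, 402]
untreated = [1010, 980, 1005]
print(classify_substance(treated, untreated))
```

For the binding assays of claims 22 and 27 only a decrease is recited, so only the "inhibits" outcome would be relevant there.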



Fig. 1

Fig. 2

Fig. 3



WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6:
C12N 15/12, C07K 14/47, G01N 33/50, A61K 38/17

A3



(11) International Publication Number: WO 98/03652 

(43) International Publication Date: 29 January 1998 (29.01.98) 



(21) International Application Number: PCT/US97/12877

(22) International Filing Date: 23 July 1997 (23.07.97) 



(30) Priority Data:
60/022,273    23 July 1996 (23.07.96)    US



(60) Parent Application or Grant
(63) Related by Continuation
US 60/022,273 (CIP)
Filed on 23 July 1996 (23.07.96)



(71) Applicant (for all designated States except US): THE GOVERNMENT
OF THE UNITED STATES OF AMERICA,
represented by THE SECRETARY, DEPARTMENT OF
HEALTH AND HUMAN SERVICES [US/US]; National
Institutes of Health, Office of Technology Transfer, Suite
325, 6011 Executive Boulevard, Rockville, MD 20852 (US).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): NAKATANI, Yoshihiro
[JP/US]; 4624 Edgefield Road, Bethesda, MD 20814 (US).
HOWARD, Bruce, H. [US/US]; 8715 Fallen Oak Drive,
Bethesda, MD 20817 (US).



(74) Agents: MILLER, Mary, L. et al.; Needle & Rosenberg, Suite
1200, 127 Peachtree Street, N.E., Atlanta, GA 30303 (US).



(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR,
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE,
HU, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT,
LU, LV, MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT,
RO, RU, SD, SE, SG, SI, SK, TJ, TM, TR, TT, UA, UG,
US, UZ, VN, ARIPO patent (GH, KE, LS, MW, SD, SZ,
UG, ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD,
RU, TJ, TM), European patent (AT, BE, CH, DE, DK, ES,
FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OAPI patent
(BF, BJ, CF, CG, CI, CM, GA, GN, ML, MR, NE, SN, TD,
TG).



Published 

With international search report. 

Before the expiration of the time limit for amending the claims 
and to be republished in the event of the receipt of amendments. 

(88) Date of publication of the international search report: 

26 February 1998 (26.02.98) 



(54) Title: P300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF AND USES THEREOF 



(57) Abstract 

The present invention provides a purified protein designated P/CAF having a
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones and which also binds to the p300/CBP cellular protein. The present
invention further provides a nucleic acid encoding the P/CAF protein as well as a
vector containing the nucleic acid and a host for the vector. A purified antibody
which specifically binds the P/CAF protein is also provided. Also provided are
methods of screening for compounds that inhibit or stimulate the transcription
modulating and histone acetyltransferase activity of P/CAF and p300/CBP.






FOR THE PURPOSES OF INFORMATION ONLY



Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT. 



AL  Albania                   ES  Spain                LS  Lesotho                SI  Slovenia
AM  Armenia                   FI  Finland              LT  Lithuania              SK  Slovakia
AT  Austria                   FR  France               LU  Luxembourg             SN  Senegal
AU  Australia                 GA  Gabon                LV  Latvia                 SZ  Swaziland
AZ  Azerbaijan                GB  United Kingdom       MC  Monaco                 TD  Chad
BA  Bosnia and Herzegovina    GE  Georgia              MD  Republic of Moldova    TG  Togo
BB  Barbados                  GH  Ghana                MG  Madagascar             TJ  Tajikistan
BE  Belgium                   GN  Guinea               MK  The former Yugoslav    TM  Turkmenistan
BF  Burkina Faso              GR  Greece                   Republic of Macedonia  TR  Turkey
BG  Bulgaria                  HU  Hungary              ML  Mali                   TT  Trinidad and Tobago
BJ  Benin                     IE  Ireland              MN  Mongolia               UA  Ukraine
BR  Brazil                    IL  Israel               MR  Mauritania             UG  Uganda
BY  Belarus                   IS  Iceland              MW  Malawi                 US  United States of America
CA  Canada                    IT  Italy                MX  Mexico                 UZ  Uzbekistan
CF  Central African Republic  JP  Japan                NE  Niger                  VN  Viet Nam
CG  Congo                     KE  Kenya                NL  Netherlands            YU  Yugoslavia
CH  Switzerland               KG  Kyrgyzstan           NO  Norway                 ZW  Zimbabwe
CI  Côte d'Ivoire             KP  Democratic People's  NZ  New Zealand
CM  Cameroon                      Republic of Korea    PL  Poland
CN  China                     KR  Republic of Korea    PT  Portugal
CU  Cuba                      KZ  Kazakstan            RO  Romania
CZ  Czech Republic            LC  Saint Lucia          RU  Russian Federation
DE  Germany                   LI  Liechtenstein        SD  Sudan
DK  Denmark                   LK  Sri Lanka            SE  Sweden
EE  Estonia                   LR  Liberia              SG  Singapore







INTERNATIONAL SEARCH REPORT 



International Application No

PCT/US 97/12877 



A. CLASSIFICATION OF SUBJECT MATTER 

IPC 6 C12N15/12 C07K14/47 G01N33/50 A61K38/17



According to International Patent Classification (IPC) or to both national classification and IPC



B. FIELDS SEARCHED 



Minimum documentation searched (classification system followed by classification symbols) 

IPC 6 C07K 



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 



Electronic data base consulted during the international search (name of data base and, where practical, search terms used) 



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Category * Citation of document, with indication, where appropriate, of the relevant passages 



Relevant to claim No. 



EMBL EST, Accession number N39522
Sequence no yv27b08.s1 Homo sapiens cDNA
clone 243927 3'
25 January 1996
XP002050402

see the whole document 

GEORGAKOPOULOS, T. & THIREOS, G. : 
EMBO JOURNAL. , 

vol. 11, 1992, EYNSHAM, OXFORD GB, 
pages 4145-4152, XP002050399 
see the whole document 



[X] Further documents are listed in the continuation of box C.

[ ] Patent family members are listed in annex.



* Special categories of cited documents :

"A" document defining the general state of the art which is not
considered to be of particular relevance

"E" earlier document but published on or after the international
filing date

"L" document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or
other means

"P" document published prior to the international filing date but
later than the priority date claimed

"T" later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

"X" document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art.

"&" document member of the same patent family



Date of the actual completion of the international search 



17 December 1997 



Date of mailing of the international search report



1 i 01 98 



Name and mailing address of the ISA 

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx 31 651 epo nl, 
Fax: (+31-70) 340-3016 



Authorized officer



Chambonnet , F 



Form PCT/ISA/210 (second sheet) (July 1992)






C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT



Category * Citation of document, with indication, where appropriate, of the relevant passages



Relevant to claim No. 



YANG, X.Y. ET AL.: "A p300/CBP-associated
factor that competes with the adenoviral
oncoprotein E1A"
NATURE,
vol. 382, no. 6589, 25 July 1996, LONDON
GB,
pages 319-324, XP002050400
see the whole document

OGRYZKO, V.V. ET AL.: "The
transcriptional coactivators p300 and CBP
are histone acetyltransferases"
CELL,
vol. 87, no. 5, November 1996, NA US,
pages 953-959, XP002050401
see the whole document

EMBL EST, Accession number U57316,
Sequence reference human GCN5 (hGCN5)
complete cds.
26 August 1996
XP002050403

see the whole document 



Form PCT/ISA/210 (continuation of second sheet) (July 1992)






Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. [X] Claims Nos.:
because they relate to subject matter not required to be searched by this Authority, namely:
see FURTHER INFORMATION sheet PCT/ISA/210

2. [ ] Claims Nos.:
because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:

3. [ ] Claims Nos.:
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

1. [ ] As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:

4. [ ] No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

Remark on Protest

[ ] The additional search fees were accompanied by the applicant's protest.
[ ] No protest accompanied the payment of additional search fees.



Form PCT/ISA/210 (continuation of first sheet (1)) (July 1992) 



International Application No. PCT/US 97/12877 



FURTHER INFORMATION CONTINUED FROM PCT/ISA/210



Remark: Although claims 32 to 42 are directed to a method of treatment
of the human/animal body, the search has been carried out and based on
the alleged effects of the composition.



CORRECTED 
VERSION* 



WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 :

C12N 15/12, C07K 14/47, G01N 33/50, 
A61K 38/17 



A3 



(11) International Publication Number: WO 98/03652 

(43) International Publication Date: 29 January 1998 (29.01.98)



(21) International Application Number: PCT/US97/12877

(22) International Filing Date: 23 July 1997 (23.07.97) 



(30) Priority Data:
60/022,273          23 July 1996 (23.07.96)          US

(60) Parent Application or Grant
(63) Related by Continuation
US                  60/022,273 (CIP)
Filed on            23 July 1996 (23.07.96)



(71) Applicant (for all designated States except US): THE GOVERNMENT
OF THE UNITED STATES OF AMERICA,
represented by THE SECRETARY, DEPARTMENT OF
HEALTH AND HUMAN SERVICES [US/US]; National
Institutes of Health, Office of Technology Transfer, Suite
325, 6011 Executive Boulevard, Rockville, MD 20852 (US).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): NAKATANI, Yoshihiro 
[JP/US]; 4624 Edgefield Road, Bethesda, MD 20814 (US). 
HOWARD, Bruce, H. [US/US]; 8715 Fallen Oak Drive, 
Bethesda, MD 20817 (US). 



(74) Agents: MILLER, Mary, L. et al.; Needle & Rosenberg, Suite
1200, 127 Peachtree Street, N.E., Atlanta, GA 30303 (US).



(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR, 
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE, 
HU, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, 
LU, LV, MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, 
RO, RU, SD, SE, SG, SI, SK, TJ, TM, TR, TT, UA, UG, 
US, UZ, VN, ARIPO patent (GH, KE, LS, MW, SD, SZ, 
UG, ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD, 
RU, TJ, TM), European patent (AT, BE, CH, DE, DK, ES, 
FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OAPI patent 
(BF, BJ, CF, CG, CI, CM, GA, GN, ML, MR, NE, SN, TD, 
TG). 



Published 

With international search report. 
Before the expiration of the time limit for amending the claims 
and to be republished in the event of the receipt of amendments. 

(88) Date of publication of the international search report: 

26 February 1998 (26.02.98) 



(54) Title: P300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF AND USES THEREOF
(57) Abstract 



The present invention provides a purified protein
designated P/CAF having a molecular weight of about
93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing
conditions and which acetylates histones and which also
binds to the p300/CBP cellular protein. The present
invention further provides a nucleic acid encoding the
P/CAF protein as well as a vector containing the nucleic
acid and a host for the vector. A purified antibody
which specifically binds the P/CAF protein is also provided.
Also provided are methods of screening for compounds
that inhibit or stimulate the transcription modulating
and histone acetyltransferase activity of P/CAF
and p300/CBP.



[Figure: immunoblot panels with numbered lanes; IP: α-P/CAF, Control]



*(Referred to in PCT Gazette No. 17/1998, Section II)













P300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF AND USES THEREOF 



BACKGROUND OF THE INVENTION 

Field of the Invention

The present invention provides a transcriptional co-factor, p300/CBP-associated
factor (P/CAF), which modulates transcription through binding to the cellular
transcription co-factors p300 and CBP and through acetylation of histones. Also
provided are methods for screening for the presence of P/CAF and for substances which
alter the transcription modulating effect and growth regulatory activity of P/CAF.



Background Art 

Cellular proteins p300 and CBP are global transcriptional coactivators that are
involved in the regulation of various DNA-binding transcriptional factors (Janknecht and
Hunter, 1996). Recently, p300 was found to be very closely related to CBP, a factor
that binds selectively to the protein kinase A-phosphorylated form of CREB (3-5).
Cellular factors p300 and CBP exhibit strong amino acid sequence similarity and share
the capacity to bind both CREB and E1A (6-8). Although neither p300 nor CBP by
itself binds to DNA, each can be recruited to promoter elements via interaction with
sequence-specific activators, where it functions as a transcriptional adaptor. For
simplicity, p300 and CBP will be termed p300/CBP in the context of discussing their
shared functional properties.

p300/CBP is a large protein consisting of over 2,400 amino acids, known to
interact with a variety of DNA-binding transcriptional factors including nuclear hormone
receptors (13,57), CREB (3,4,7), c-Jun/v-Jun (9,11), YY1 (10), c-Myb/v-Myb (12,58),
Sap-1a (59), c-Fos (11) and MyoD (60). DNA-binding factors recruit p300/CBP not
only by direct but also indirect interactions through cofactors; for example, nuclear
hormone receptors recruit p300/CBP directly as well as through indirect interactions, via
SRC-1, which stimulates transcription by binding to various nuclear hormone receptors
(13,61).

The transforming proteins encoded by adenovirus and several other small DNA
tumor viruses disturb host cell growth control by interacting with cellular factors that
normally function to repress cell proliferation. One of the most intensively studied of
these viral proteins, the product of the adenovirus E1A gene, is itself sufficient for
transformation (1). E1A transforming activity resides in two distinct domains, the
targets of which include p300/CBP and products of the retinoblastoma (RB)
susceptibility gene family (1,2). Interactions of E1A with p300/CBP and RB are
thought to influence functionally distinct growth regulatory pathways, allowing the two
domains to contribute additively to transformation (1).

The paradigm for how E1A and functionally related viral proteins perturb cell
growth regulation derives in large part from studies on their interactions with RB (1,2).
The molecular function of E1A is based on its capacity to interfere with cellular protein-
protein interactions. Since both E1A and various cellular targets bind to a site in RB
termed the pocket domain (2), E1A can competitively disrupt the complex formation
between RB and its cellular targets.

The second cellular factor implicated in E1A-dependent transformation, p300, is
believed to inhibit G0/G1 exit, to activate certain enhancers, and to stimulate
differentiation (1,2). E1A inhibits the p300/CBP-mediated transcriptional activation of
many promoters (14). In one case that has been examined, the complex of p300 and
YY1, E1A inhibits transcription without disrupting the complex (10).

The present invention provides a cellular protein designated P/CAF which binds
to p300/CBP and plays an important role in both transcription and cell cycle regulation
associated with a histone acetyltransferase activity. The present invention also provides
a histone acetyltransferase activity in the p300/CBP cellular protein, thus providing
targets for modulating transcription and cell cycle regulation in cells.



SUMMARY OF THE INVENTION

The present invention provides a purified protein designated P/CAF having a
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones and which also binds to the p300/CBP cellular protein.

The present invention further provides a nucleic acid encoding the P/CAF
protein as well as a vector containing the nucleic acid and a host for the vector. A
purified antibody which specifically binds the P/CAF protein is also provided.

In addition, provided is a bioassay for screening substances for the ability to
inhibit the transcription modulating activity and/or histone acetyltransferase
activity of P/CAF, comprising contacting the substance with a system in which histone acetylation
by P/CAF can be determined; determining the amount of histone acetylation by P/CAF
in the presence of the substance; and comparing the amount of histone acetylation by
P/CAF in the presence of the substance with the amount of histone acetylation by
P/CAF in the absence of the substance, a decreased amount of histone acetylation by
P/CAF in the presence of the substance indicating a substance that can inhibit the
transcription modulating activity and/or histone acetyltransferase activity of P/CAF.

Furthermore, the present invention provides a bioassay for screening substances
for the ability to inhibit the transcription modulating activity and/or histone
acetyltransferase activity of P/CAF comprising contacting the substance with a system in
which the p300 binding of P/CAF can be determined; determining the amount of p300
binding of P/CAF in the presence of the substance; and comparing the amount of p300
binding of P/CAF in the presence of the substance with the amount of p300 binding of
P/CAF in the absence of the substance, a decreased amount of p300 binding of P/CAF in
the presence of the substance indicating a substance that can inhibit the transcription
modulating activity and/or histone acetyltransferase activity of P/CAF.

Also provided is a method for determining the amount of P/CAF in a biological
sample comprising contacting the biological sample with a polypeptide comprising the
amino acid sequence of SEQ ID NO:3 under conditions whereby a P/CAF/p300
complex can be formed; and determining the amount of the P/CAF/p300 complex, the
amount of the complex indicating the amount of P/CAF in the sample.

The present invention additionally provides a method for determining the amount
of P/CAF in a biological sample comprising contacting the biological sample with an
antibody which specifically binds P/CAF under conditions whereby a P/CAF/antibody
complex can be formed; and determining the amount of the P/CAF/antibody complex,
the amount of the complex indicating the amount of P/CAF in the sample.

Also provided herein is an assay for screening substances for the ability to inhibit
or stimulate the histone acetyltransferase activity of P/CAF, comprising: contacting the
substance with a system in which histone acetylation by P/CAF can be determined;
determining the amount of histone acetylation by P/CAF in the presence of the
substance; and comparing the amount of histone acetylation by P/CAF in the presence of
the substance with the amount of histone acetylation by P/CAF in the absence of the
substance, a decreased or increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can inhibit or stimulate,
respectively, the histone acetyltransferase activity of P/CAF.
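
The comparison step of this assay can be sketched in a few lines of Python. The sketch is illustrative only: the function name `classify_substance`, the replicate readings, and the 10% tolerance band are assumptions introduced here and are not values taken from the specification.

```python
from statistics import mean


def classify_substance(with_substance, without_substance, tolerance=0.10):
    """Compare histone acetylation measured with and without a test substance.

    Both arguments are lists of replicate acetylation readings (arbitrary
    units). `tolerance` is a hypothetical band around a ratio of 1.0 inside
    which no effect is called; the specification does not define one.
    Returns a (label, ratio) tuple.
    """
    if not with_substance or not without_substance:
        raise ValueError("both conditions need at least one measurement")
    baseline = mean(without_substance)
    if baseline <= 0:
        raise ValueError("baseline acetylation must be positive")
    ratio = mean(with_substance) / baseline
    if ratio < 1.0 - tolerance:
        return "inhibits", ratio      # decreased acetylation
    if ratio > 1.0 + tolerance:
        return "stimulates", ratio    # increased acetylation
    return "no clear effect", ratio


# Hypothetical readings, for illustration only.
label, ratio = classify_substance([50, 50], [100, 100])
print(label, ratio)
```

The same comparison would apply to the p300/CBP acetylation assay and, with binding signal in place of acetylation signal, to the binding assays, where only a decrease is scored.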

The present invention further provides an assay for screening substances for the
ability to inhibit binding of P/CAF to p300/CBP comprising: contacting the substance
with a system in which the P/CAF binding of p300/CBP can be determined; determining
the amount of P/CAF binding of p300/CBP in the presence of the substance; and
comparing the amount of binding of P/CAF to p300/CBP in the presence of the
substance with the amount of binding of P/CAF to p300/CBP in the absence of the
substance, a decreased amount of binding of P/CAF to p300/CBP in the presence of the
substance indicating a substance that can inhibit binding of P/CAF to
p300/CBP.

In addition, an assay is provided for screening substances for the ability to inhibit
or stimulate the histone acetyltransferase activity of p300/CBP, comprising: contacting
the substance with a system in which histone acetylation by p300/CBP can be
determined; determining the amount of histone acetylation by p300/CBP in the presence
of the substance; and comparing the amount of histone acetylation by p300/CBP in the
presence of the substance with the amount of histone acetylation by p300/CBP in the
absence of the substance, a decreased or increased amount of histone acetylation by
p300/CBP in the presence of the substance indicating a substance that can inhibit or
stimulate, respectively, the histone acetyltransferase activity of p300/CBP.

Furthermore, the present invention provides an assay for screening substances
for the ability to inhibit binding of a DNA-binding transcription factor to p300/CBP,
comprising: contacting the substance with a system in which the DNA-binding
transcription factor binding of p300/CBP can be determined; determining the amount of
DNA-binding transcription factor binding of p300/CBP in the presence of the substance;
and comparing the amount of binding of DNA-binding transcription factor to p300/CBP
in the presence of the substance with the amount of binding of DNA-binding
transcription factor to p300/CBP in the absence of the substance, a decreased amount of
binding of DNA-binding transcription factor to p300/CBP in the presence of the
substance indicating a substance that can inhibit binding of DNA-binding
transcription factor to p300/CBP.

A method is also provided for inhibiting the transcription modulating activity of
P/CAF in a subject, comprising administering to the subject a transcription modulating
activity inhibiting amount of a substance in a pharmaceutically acceptable carrier.

Also provided in the present invention is a method for stimulating the
transcription modulating activity of P/CAF in a subject, comprising administering to the
subject a transcription modulating activity stimulating amount of a substance in a
pharmaceutically acceptable carrier.

Furthermore, the present invention provides a method for inhibiting the histone
acetyltransferase activity of p300/CBP in a subject, comprising administering to the
subject a histone acetyltransferase activity inhibiting amount of a substance in a
pharmaceutically acceptable carrier.

Finally, the present invention additionally provides a method for stimulating the
histone acetyltransferase activity of p300/CBP in a subject, comprising administering to
the subject a histone acetyltransferase activity stimulating amount of a substance in a
pharmaceutically acceptable carrier.

BRIEF DESCRIPTION OF THE FIGURES 



Figs. 1A-B. Fig. 1A: P/CAF-p300/CBP interaction in vivo. Cell extract was
immunoprecipitated with rabbit anti-P/CAF (lanes 1, 4, and 7), rabbit anti-CBP (lanes 2
and 5), and mouse anti-p300 (lane 9) antibodies. For controls, cell extract was
precipitated with rabbit control IgG (lanes 3, 6, and 8) or mouse anti-HA monoclonal
antibody (lane 10). The precipitates were analyzed by immunoblotting with anti-P/CAF
(lanes 1-3), anti-CBP (lanes 4-6), and anti-p300 (lanes 7-10) antibodies. The positions
of non-specific bands are indicated by asterisks. Fig. 1B: E1A inhibits the P/CAF-p300
interaction in vivo. Osteosarcoma cells were transfected with either control vector
(lanes 1 and 4) or E1A- (lanes 2 and 5) or E1AΔN- (lanes 3 and 6) expression vectors.
Extract from the transfected subpopulation was immunoprecipitated with anti-P/CAF
(lanes 1-3) or control (lanes 4-6) IgG. The precipitates were analyzed by
immunoblotting with anti-p300 and anti-P/CAF.

Figs. 2A-F. P/CAF and E1A mediate antagonistic effects on cell cycle
progression. HeLa cells (ATCC accession number CCL 2) were transfected by
electroporation with 7 µg of P/CAF-expression plasmid and/or 3 µg of the full-length or
the N-terminally deleted (Δ2-36) E1A 12S-expression plasmid as indicated in the figure.
These plasmids were constructed by subcloning FLAG-P/CAF and E1A cDNAs into
pCX (34) and pcDNAI (Invitrogen), respectively. All samples, in addition, contained 1
µg of sorting plasmid (pCMV-IL2R) (31) and carrier plasmid (pCX) to normalize the
total amount of DNA to 11 µg. After transfection, cells were incubated in Dulbecco's
modified Eagle's medium with 10% fetal bovine calf serum for 12 hours and
subsequently labeled in medium containing 10 µM bromo-deoxyuridine (BrdU) for 30
min. Subsequently, the transfected subpopulation was purified by magnetic affinity cell
sorting and nuclei were analyzed by dual parameter flow cytometry as described (32).
Histograms show percentages of cells in G1 and S phases. Abscissa values represent
fluorescence intensity of bound anti-BrdU antibodies in log scale.
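
The DNA normalization described in this legend is simple arithmetic: 7 µg of P/CAF plasmid, 3 µg of E1A plasmid and 1 µg of sorting plasmid already sum to 11 µg, so carrier plasmid is needed only when one or both expression plasmids are omitted. A minimal Python sketch follows; the function name `carrier_amount_ug` and its argument names are hypothetical, and the default amounts are those read from the legend.

```python
def carrier_amount_ug(pcaf_ug=0.0, e1a_ug=0.0, sorting_ug=1.0, total_ug=11.0):
    """Return the carrier plasmid (pCX) mass needed to reach the total DNA mass.

    Amounts are in micrograms. Defaults follow the figure legend: 1 ug of
    sorting plasmid per sample and 11 ug of total DNA.
    """
    used = pcaf_ug + e1a_ug + sorting_ug
    if used > total_ug:
        raise ValueError("plasmid amounts exceed the total DNA target")
    return total_ug - used


# The four plasmid combinations implied by "and/or" in the legend.
for pcaf, e1a in [(0, 0), (7, 0), (0, 3), (7, 3)]:
    print(f"P/CAF {pcaf} ug, E1A {e1a} ug -> carrier {carrier_amount_ug(pcaf, e1a)} ug")
```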

Fig. 3. Histone acetyltransferase activity of P/CAF. Activity of hGCN5 (lanes 1
and 4) and P/CAF (lanes 2 and 5) that acetylates free histones (lanes 1-3) or histones in
the nucleosome core particle (35) (lanes 4-6) was measured as described (36). Each
reaction contains 0.3 pmol of affinity purified FLAG-hGCN5 or FLAG-P/CAF, 4 pmol
of the histone octamer or the nucleosome core particle and 10 pmol of [1-14C]acetyl-
CoA. Note that the histone octamer dissociates into dimers or tetramers under assay
conditions. Acetylated histones were detected by autoradiography after separation by
SDS-PAGE. The bands corresponding to acetylated histones H3 and H4 are indicated
by arrows.

DETAILED DESCRIPTION OF THE INVENTION

As used in the specification and in the claims, "a" can mean one or more, 
depending upon the context in which it is used. 

P/CAF protein and fragments

The present invention provides a purified protein designated P/CAF having a
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones. The P/CAF protein can also bind to the amino acid region of SEQ ID NO:3
(amino acid (aa) residues 1753 - 1966) of the cellular transcriptional factor, p300 (which
has the complete amino acid sequence of SEQ ID NO:6 and the nucleotide sequence of
SEQ ID NO:12), and the amino acid region of SEQ ID NO:6 (amino acid residues 1805
- 1854) of the cellular transcriptional factor, CBP (which has the complete amino acid
sequence of SEQ ID NO:7 and the nucleotide sequence of SEQ ID NO:13). The
P/CAF protein can be defined by any one or more of the typically used parameters.
Examples of these parameters include, but are not limited to, molecular weight
(calculated or empirically determined), isoelectric focusing point, specific epitope(s),
complete amino acid sequence, sequence of a specific region (e.g., N-terminus) of the
amino acid sequence and the like.

For example, the P/CAF protein can consist of the amino acid sequence of SEQ
ID NO:1 or the P/CAF protein can comprise the amino acid sequence of SEQ ID NO:2
which represents the carboxy terminal end of the P/CAF protein and contains the histone
acetyltransferase activity, or the amino acid sequence of SEQ ID NO:4, which
represents the amino terminal end of the P/CAF protein, containing the binding site for
p300/CBP. Because the amino-terminal region is specific for P/CAF it can be used to
define and identify P/CAF.

As used herein, "purified" refers to a protein (polypeptide, peptide, etc.) that is
sufficiently free of contaminants or cell components with which it normally occurs to
distinguish it from the contaminants or other components of its natural environment.
The purified protein need not be homogeneous, but must be sufficiently free of
contaminants to be useful in a clinical or research setting, for example, in an assay for
detecting antibodies to the protein. Greater levels of purity can be obtained using
methods derived from well known protocols. Specific methods for purifying P/CAF
proteins are known in the art.



As will be appreciated by those skilled in the art, the invention also includes
those P/CAF polypeptides having slight variations in amino acid sequence which yield
polypeptides equivalent to the P/CAF protein defined herein. Such variations may arise
naturally as allelic variations (e.g., due to genetic polymorphism) or may be produced by
human intervention (e.g., by mutagenesis of cloned DNA sequences), such as induced
point, deletion, insertion and substitution mutants. Minor changes in amino acid
sequence are generally preferred, such as conservative amino acid replacements, small
internal deletions or insertions, and additions or deletions at the ends of the molecules.
Substitutions may be designed based on, for example, the model of Dayhoff et al. (37).
These modifications can result in changes in the amino acid sequence, provide silent
mutations, modify a restriction site, or provide other specific mutations.

Modifications to any of the P/CAF proteins or fragments can be made, while
preserving the specificity and activity (function) of the native protein or fragment
thereof. As used herein, "native" describes a protein that occurs in nature. The
modifications contemplated herein can be conservative amino acid substitutions, for
example, the substitution of a basic amino acid for a different basic amino acid.
Modifications can also include creation of fusion proteins with epitope tags or known
recombinant proteins or genes encoding them created by subcloning into commercial or
non-commercial vectors (e.g., polyhistidine tags, flag tags, myc tag, glutathione-S-
transferase [GST] fusion protein, xylE fusion reporter construct). Furthermore, the
modifications can be such as do not affect the function of the protein or the way the
protein accomplishes that function (e.g., its secondary structure or the ultimate result of
the protein's activity). These products are equivalent to the P/CAF protein. The means
for determining the function, way and result parameters are well known.

Having provided an example of a purified P/CAF protein, the invention also
enables the purification of P/CAF homologs from other species and allelic variants from
individuals within a species. For example, an antibody raised against the exemplary
human P/CAF protein can be used routinely to screen preparations from different
humans for allelic variants of the P/CAF protein that react with the P/CAF protein-
specific antibody. Similarly, an antibody raised against an epitope, for example, from a
conserved amino acid region of the human P/CAF protein can be used to routinely
screen for homologs of the P/CAF protein in other species. A P/CAF protein can be
routinely identified in and obtained from other species and from individuals within a
species using the methods taught herein and others known in the art. For example,
given the present sequence, the DNA encoding a conserved amino acid sequence can be
used to probe genomic DNA or DNA libraries of an organism to predictably obtain the
P/CAF gene for that organism. The gene can then be cloned and expressed as the
P/CAF protein and purified according to any of a number of routine, predictable
methods. An example of the routine protein purification methods available in the art can
be found in Pei et al. (38).

A purified polypeptide fragment of the P/CAF protein is also provided. The
term "fragment" as used herein regarding a P/CAF protein, means a molecule of at least
five contiguous amino acids of P/CAF protein that has at least one function shared by
P/CAF protein or a region thereof. These functions can include antigenicity, binding
capacity, acetyltransferase activity and structural roles, among others. The P/CAF
fragment can be specific for a recited source. As used herein to describe an amino acid
sequence (protein, polypeptide, peptide, etc.), "specific" means that the amino acid
sequence is not found identically in any other source. The determination of specificity is
made routine by the availability of computerized amino acid sequence databases and
sequence comparison programs, wherein an amino acid sequence of almost any length
can be quickly and reliably checked for the existence of identical sequences. If an
identical sequence is not found, the protein is "specific" for the recited source. For
example, a P/CAF fragment can be species-specific (e.g., found in the P/CAF protein of
humans, but not of other species).
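
As an illustration only, the database comparison described above can be sketched in a few lines of Python. The function name `is_specific`, the dictionary layout and the short sequences are hypothetical placeholders (they are not P/CAF sequences); an actual determination would search a full sequence database with a sequence comparison program.

```python
def is_specific(fragment: str, source: str, database: dict[str, str]) -> bool:
    """Return True if `fragment` occurs in `source` and in no other entry.

    `database` maps a source name to its amino acid sequence (one-letter code).
    """
    frag = fragment.upper()
    if len(frag) < 5:
        raise ValueError("a fragment is at least five contiguous amino acids")
    if frag not in database[source].upper():
        raise ValueError("fragment does not occur in the recited source")
    # "Specific" means no identical sequence is found in any other source.
    return all(
        frag not in seq.upper()
        for name, seq in database.items()
        if name != source
    )


# Hypothetical toy sequences, used only to exercise the function.
toy_database = {
    "species_A": "MSEAGGAGPGGWLK",
    "species_B": "MAEAGGAGPGCWLK",
}
```

With these toy entries, `is_specific("MSEAG", "species_A", toy_database)` is true because the fragment is absent from `species_B`, while `is_specific("EAGGA", "species_A", toy_database)` is false because the identical sequence occurs in both entries.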

A fragment of the P/CAF protein having histone acetyltransferase activity can
consist of the amino acid sequence of SEQ ID NO:2. A fragment of the P/CAF protein
which binds to the amino acid sequence of SEQ ID NO:3 on p300 and the amino acid
sequence of SEQ ID NO:9 on CBP can consist of the amino acid sequence of SEQ ID
NO:4. To the extent that these fragments are specific for P/CAF, they can be used to
identify and define P/CAF.

An antigenic fragment of P/CAF protein is provided. An antigenic fragment has
an amino acid sequence of at least about five consecutive amino acids of a P/CAF
protein amino acid sequence and binds an antibody or elicits an immune response in an
animal. An antigenic fragment can be selected by applying the routine technique of
epitope mapping to P/CAF protein to determine the regions of the protein that contain
epitopes reactive with antibodies or are capable of eliciting an immune response in an
animal. Once the epitope is selected, an antigenic polypeptide containing the epitope
can be synthesized directly, or produced recombinantly by cloning nucleic acids
encoding the antigenic polypeptide in an expression system, according to standard
methods.

Alternatively, an antigenic fragment of the antigen can be isolated from the
whole P/CAF protein or a larger fragment of the P/CAF protein by chemical or
mechanical disruption. Fragments can also be randomly chosen from a known P/CAF
protein sequence and synthesized. The purified fragments thus obtained can be tested to
determine their antigenicity and specificity by routine methods.

Nucleic Acids Encoding P/CAF Protein

An isolated nucleic acid that encodes a P/CAF protein is also provided. As used
herein, the term "isolated" means a nucleic acid separated or substantially free from at
least some of the other components of the naturally occurring organism, for example,
the cell structural components commonly found associated with nucleic acids in a
cellular environment and/or other nucleic acids. The isolation of nucleic acids can
therefore be accomplished by techniques such as cell lysis followed by phenol plus
chloroform extraction, followed by ethanol precipitation of the nucleic acids (39). It is
not contemplated that the isolated nucleic acids are necessarily totally free of all
non-nucleic acid components or all other nucleic acids, but that the isolated nucleic acids are
isolated to a degree of purification to be useful in clinical, diagnostic, experimental, or
other procedures such as, for example, gel electrophoresis, Southern, Northern or dot
blot hybridization, or polymerase chain reaction (PCR).

A skilled artisan in the field will readily appreciate that there are a multitude of
procedures which may be used to isolate the nucleic acids prior to their use in other
procedures. These include, but are not limited to, lysis of the cell followed by gel
filtration or anion exchange chromatography, binding DNA to silica in the form of glass
beads, filters or diatoms in the presence of high concentrations of chaotropic salts, or
ethanol precipitation of the nucleic acids.

The nucleic acids of the present invention can include positive and negative
strand RNA as well as DNA and can include genomic and subgenomic nucleic acids
found in the naturally occurring organism. The nucleic acids contemplated by the
present invention include double stranded and single stranded DNA of the genome,
complementary positive stranded cRNA and mRNA, and complementary cDNA
produced therefrom and any nucleic acid which can selectively or specifically hybridize
to the isolated nucleic acids provided herein. Stringent conditions (further described
below) are used to distinguish selectively or specifically hybridizing nucleic acids from
non-selectively and non-specifically hybridizing nucleic acids.

An isolated nucleic acid that encodes a P/CAF protein can be species-specific
(i.e., does not encode the P/CAF protein of other species and does not occur in other
species). Examples of the nucleic acids contemplated herein include the nucleic acid of
SEQ ID NO:10 as well as the nucleic acids that encode each of the P/CAF proteins or
fragments thereof described herein. P/CAF proteins and protein fragments can be
routinely obtained as described herein and their structure (sequence) determined by
routine means including the methods as used herein.

P/CAF protein-encoding nucleic acids can be isolated from an organism in which
they are normally found (e.g., humans), using any of the routine techniques. For
example, a genomic DNA or cDNA library can be constructed and screened for the
presence of the nucleic acid of interest using one of the present P/CAF protein-encoding
nucleic acids as a probe. Methods of constructing and screening such libraries are well
known in the art and kits for performing the construction and screening steps are
commercially available (for example, Stratagene Cloning Systems, La Jolla, CA). Once
isolated, the nucleic acid can be directly cloned into an appropriate vector, or if
necessary, be modified to facilitate the subsequent cloning steps. Such modification
steps are routine, an example of which is the addition of oligonucleotide linkers, which
contain restriction sites, to the termini of the nucleic acid (see, for example, ref. 39).

P/CAF protein-encoding nucleic acids can also be synthesized. For example, a
method of obtaining a DNA molecule encoding a specific P/CAF protein is to synthesize
a recombinant DNA molecule which encodes the P/CAF protein. Nucleic
acid synthesis procedures are routine in the art and oligonucleotides coding for a
particular protein region are readily obtainable through automated DNA synthesis. A
nucleic acid for one strand of a double-stranded molecule can be synthesized and
hybridized to its complementary strand. One can design these oligonucleotides such that
the resulting double-stranded molecule has either internal restriction sites or appropriate
5' or 3' overhangs at the termini for cloning into an appropriate vector.

Oligonucleotides complementary to or identical with the P/CAF protein-encoding
nucleic acid sequence can be synthesized as primers for amplification
reactions, such as PCR, or as probes to detect P/CAF protein-encoding nucleic acids by
various hybridization protocols (e.g., Northern blot, Southern blot, dot blot, colony
screening, etc.).



Double-stranded molecules coding for relatively large proteins can readily be
synthesized by first constructing several different double-stranded molecules that code
for particular regions of the protein, followed by ligating these DNA molecules together.
For example, Cunningham, et al. (40), have constructed a synthetic gene encoding the
human growth hormone by first constructing overlapping and complementary synthetic
oligonucleotides and ligating these fragments together. See also, Ferretti, et al. (41),
wherein synthesis of a 1057 base pair synthetic bovine rhodopsin gene from synthetic
oligonucleotides is disclosed.

By constructing a P/CAF protein-encoding nucleic acid in this manner, one
skilled in the art can readily obtain any particular P/CAF protein with modifications at
any particular position or positions. See also, U.S. Patent No. 5,503,995 which
describes an enzyme template reaction method of making synthetic genes. Techniques
such as this are routine in the art and are well documented. DNA encoding the P/CAF
protein or P/CAF protein fragments can then be expressed in vivo or in vitro.

The nucleic acid encoding the P/CAF protein can be any nucleic acid that
functionally encodes the P/CAF protein. To functionally encode the protein (i.e., allow
the nucleic acid to be expressed), the nucleic acid can include, but is not limited to,
expression control sequences, such as an origin of replication, a promoter, regions
upstream or downstream of the promoter, such as enhancers that may regulate the
transcriptional activity of the promoter, appropriate restriction sites to facilitate cloning
of inserts adjacent to the promoter, antibiotic resistance genes or other markers which
can serve to select for cells containing the vector or the vector containing the insert, and
necessary information processing sites, such as ribosome binding sites, RNA splice sites,
polyadenylation sites and transcription termination sequences as well as any other
sequence which may facilitate the expression of the inserted nucleic acid.

Preferred expression control sequences are promoters derived from
metallothionein genes, actin genes, immunoglobulin genes, CMV, SV40, adenovirus,
bovine papilloma virus, etc. A nucleic acid encoding a P/CAF protein can readily be
determined based upon the genetic code for the amino acid sequence of the P/CAF
protein and many nucleic acid sequences will encode a P/CAF protein. Modifications in
the nucleic acid sequence encoding the P/CAF protein are also contemplated.
Modifications that can be useful are modifications to the sequences controlling
expression of the P/CAF protein to make production of P/CAF protein inducible or
repressible as controlled by the appropriate inducer or repressor. Such means are
standard in the art (see, e.g., ref. 39). The nucleic acids can be generated by means
standard in the art, such as by recombinant nucleic acid techniques, as exemplified in the
examples herein, and by chemical nucleic acid synthesis or in vitro enzymatic synthesis.
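
Because the genetic code is degenerate, the number of distinct nucleic acid sequences that encode a given amino acid sequence can be counted directly. The following Python sketch is illustrative only: the function name and example peptides are hypothetical, the per-residue codon counts are those of the standard genetic code, and stop codons are not considered.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (one-letter code).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def coding_sequence_count(peptide: str) -> int:
    """Count the distinct coding sequences that translate to `peptide`."""
    residues = peptide.upper()
    unknown = set(residues) - set(CODON_COUNTS)
    if not residues or unknown:
        raise ValueError(f"not a standard amino acid sequence: {peptide!r}")
    # Each residue can be encoded independently, so the counts multiply.
    return prod(CODON_COUNTS[aa] for aa in residues)
```

For instance, a tripeptide Met-Lys-Arg has 1 x 2 x 6 = 12 possible coding sequences, and the count grows multiplicatively with length, which is why many nucleic acid sequences can encode the same protein.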

After a nucleic acid encoding a particular P/CAF protein of interest, or a region
of that nucleic acid, is constructed, modified, or isolated, that nucleic acid can then be
cloned into an appropriate vector, which can direct the in vivo or in vitro synthesis of
that wild-type and/or modified P/CAF protein. The vector is contemplated to have the
necessary functional elements that direct and regulate transcription of the inserted
nucleic acid, as described above. The vector containing the P/CAF nucleic acid or
nucleic acid fragment can be in a host (e.g., cell or transgenic animal) for expressing the
nucleic acid. The P/CAF protein or fragment thereof can thus be produced in a host
system containing the expression vector and its functional activity as described herein
can be demonstrated according to methods well known in the art.

There are numerous E. coli (Escherichia coli) expression vectors known to one
of ordinary skill in the art useful for the expression of proteins. Other microbial hosts
suitable for use include bacilli, such as Bacillus subtilis, and other enterobacteria, such
as Salmonella, Serratia, as well as various Pseudomonas species. These prokaryotic
hosts can support expression vectors which will typically contain expression control
sequences compatible with the host cell (e.g., an origin of replication). In addition, any
number of a variety of well-known promoters will be present, such as the lactose
promoter system, a tryptophan (Trp) promoter system, a beta-lactamase promoter
system, or a promoter system from phage lambda. The promoters will typically control
expression, optionally with an operator sequence, and have ribosome binding site
sequences, for example, for initiating and completing transcription and translation. If
necessary, an amino terminal methionine can be provided by insertion of a Met codon 5'
and in-frame with the gene sequence. Also, the carboxy-terminal extension of the
protein can be removed using standard oligonucleotide mutagenesis procedures.

Additionally, yeast expression can be used. There are several advantages to
yeast expression systems. First, evidence exists that proteins produced in yeast secretion
systems exhibit correct disulfide pairing. Second, post-translational glycosylation is
efficiently carried out by yeast secretory systems. The Saccharomyces cerevisiae
pre-pro-alpha-factor leader region (encoded by the MFα-1 gene) is routinely used to direct
protein secretion from yeast (42). The leader region of pre-pro-alpha-factor contains a
signal peptide and a pro-segment which includes a recognition sequence for a yeast
protease encoded by the KEX2 gene. This enzyme cleaves the precursor protein on the
carboxyl side of a Lys-Arg dipeptide cleavage-signal sequence. The polypeptide coding
sequence can be fused in-frame to the pre-pro-alpha-factor leader region. This construct
is then put under the control of a strong transcription promoter, such as the alcohol
dehydrogenase I promoter or a glycolytic promoter. The protein coding sequence is
followed by a translation termination codon which is followed by transcription
termination signals. Alternatively, the polypeptide encoding sequence of interest can be
fused to a second protein coding sequence, such as Sj26 or β-galactosidase, used to
facilitate purification of the resultant fusion protein by affinity chromatography. The
insertion of protease cleavage sites to separate the components of the fusion protein is
applicable to constructs used for expression in yeast.
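
The cleavage rule described above (cutting on the carboxyl side of a Lys-Arg dipeptide) can be illustrated with a short Python sketch. The function name and the example precursor are hypothetical, and the sequence shown is not the actual pre-pro-alpha-factor leader; the sketch models only the Lys-Arg rule stated in the text and ignores any other determinants of protease recognition.

```python
def cleave_after_lys_arg(precursor: str) -> list[str]:
    """Split a one-letter amino acid sequence after every Lys-Arg ("KR") dipeptide."""
    seq = precursor.upper()
    fragments = []
    start = 0
    site = seq.find("KR")
    while site != -1:
        # The cut falls on the carboxyl side of the Arg, i.e. after index site + 1.
        fragments.append(seq[start:site + 2])
        start = site + 2
        site = seq.find("KR", start)
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments
```

Applied to a hypothetical fusion such as `"MRFPSLDKRAAGG"`, the function returns the leader portion ending in Lys-Arg followed by the downstream polypeptide, which mirrors how an in-frame fusion to the leader region is expected to release the polypeptide of interest.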

Efficient post-translational glycosylation and expression of recombinant proteins
can also be achieved in Baculovirus expression systems in insect cells.

Mammalian cells permit the expression of proteins in an environment that favors
important post-translational modifications such as folding and cysteine pairing, addition
of complex carbohydrate structures and secretion of active protein. Vectors useful for
the expression of proteins in mammalian cells are characterized by insertion of the
protein encoding sequence between a strong viral promoter and a polyadenylation
signal. The vectors can contain genes conferring either gentamicin or methotrexate
resistance for use as selectable markers. For example, the antigen and immunoreactive
fragment coding sequence can be introduced into a Chinese hamster ovary (CHO) cell
line using a methotrexate resistance-encoding vector. Presence of the vector RNA in
transformed cells can be confirmed by Northern blot analysis and production of a cDNA
or opposite strand RNA corresponding to the protein encoding sequence can be
confirmed by Southern and Northern blot analysis, respectively. A number of other
suitable host cell lines capable of secreting intact proteins have been developed in the art
and include the CHO cell lines, HeLa cells, myeloma cell lines, Jurkat cells, and the like.
Expression vectors for these cells can include expression control sequences, as described
above. The vectors containing the nucleic acid sequences of interest can be transferred
into the host cell by well-known methods, which vary depending on the type of cell host.
For example, calcium chloride transfection is commonly utilized for prokaryotic cells,
whereas calcium phosphate treatment or electroporation may be used for other cell
hosts.

Alternative vectors for the expression of protein in mammalian cells, similar to
those developed for the expression of human gamma-interferon, tissue plasminogen
activator, clotting Factor VIII, hepatitis B virus surface antigen, protease Nexin 1, and
eosinophil major basic protein, can be employed. Further, the vector can include CMV
promoter sequences and a polyadenylation signal available for expression of inserted
nucleic acid in mammalian cells (such as COS7).

The nucleic acid sequences can be expressed in hosts after the sequences have
been positioned to ensure the functioning of an expression control sequence. These
expression vectors are typically replicable in the host organisms either as episomes or as
an integral part of the host chromosomal DNA. Commonly, expression vectors can
contain selection markers, e.g., tetracycline resistance or hygromycin resistance, to
permit detection and/or selection of those cells transformed with the desired nucleic acid
sequences (see, e.g., U.S. Patent 4,704,362).

The nucleic acids produced as described above can also be expressed in a host
which is a non-human animal to create a transgenic animal, containing, in a germ or
somatic cell, a nucleic acid comprising the coding sequence for all or a portion of the
P/CAF protein, as well as all of the other regulatory elements required for expression of
the P/CAF protein-encoding sequence. The animal will express the P/CAF gene or
portion thereof to produce the P/CAF protein or protein fragment and such expression
can be detected by determination of a particular phenotype unique to the transgenic
animal expressing the transferred nucleic acid.

The nucleic acid can be the nucleic acid of SEQ ID NO:10, a nucleic acid having
a nucleotide sequence which encodes the P/CAF protein, a nucleic acid having a
nucleotide sequence which encodes the protein of SEQ ID NO:1, as well as the nucleic
acids that encode the proteins comprising the fragments of SEQ ID NOS:2 and 4.

The nucleic acids of the invention can contain substitutions or deletions which
provide a particular phenotype of interest. For example, various deletions or base
substitutions can be introduced into the nucleic acid encoding the P/CAF protein for the
purpose of studying the effects of these particular deletions or substitutions on the
transcription modulation activity of the P/CAF protein. These effects can be monitored
by observation of such characteristics as growth and development of the animal, the
ability to develop tumors, survival rates and the like. The gene construct introduced
into the animal cells to produce the transgenic animal can contain any of the regulatory
elements described above to modulate expression of the foreign genes. As used herein,
the term "phenotype" includes morphology, biochemical profiles, changes in tumor
formation and other parameters that are affected by the presence of the P/CAF protein.

The transgenic animals of the invention can also be used in a method for
determining the effectiveness of administering a nucleic acid encoding a functional
P/CAF protein to a subject in need of a functional P/CAF protein. First, a nucleic acid
encoding a nonfunctional P/CAF protein can be introduced into the animal's cells and
expressed to yield a characteristic phenotype. Then, using standard gene therapy
techniques, a nucleic acid encoding a functional P/CAF protein can be introduced into
the animal's cells and the effects on the animal's phenotypic characteristics can be
determined.

Having provided and taught how to obtain a nucleic acid that encodes a P/CAF
protein, an isolated nucleic acid that encodes a fragment of P/CAF protein is also
provided. The nucleic acid encoding the fragment can be obtained using any of the
methods applicable to the nucleic acid encoding the entire P/CAF protein. The nucleic
acid fragment can encode a species-specific P/CAF protein fragment (e.g., found in the
P/CAF protein of humans, but not in the P/CAF proteins of other species). Nucleic
acids encoding species-specific fragments of P/CAF protein are themselves
species-specific or allele-specific fragments of the P/CAF gene.

Examples of fragments of a nucleic acid encoding a fragment of the P/CAF
protein can include the nucleic acid sequences which encode the amino acid sequences
of the fragments of SEQ ID NOS:2 or 4. The same routine computer analyses used to
select these examples of fragments can be routinely used to obtain others. Fragments of
P/CAF-encoding nucleic acids can be primers for PCR or probes, which can be
species-specific, gene-specific or allele-specific. P/CAF-encoding nucleic acid fragments can
encode antigenic or immunogenic fragments of P/CAF protein that can be used in
therapeutic assays or screening protocols. P/CAF gene fragments can encode fragments
of P/CAF protein having histone acetylase activity and/or p300/CBP binding activity as
described above, as well as other uses that may become apparent.

An isolated nucleic acid of at least ten nucleotides that selectively hybridizes with
the nucleic acid of SEQ ID NO:10 under selected conditions is provided. For example,
the conditions can be PCR amplification conditions and the hybridizing nucleic acid can
be a primer consisting of a specific fragment of the reference sequence or a nearly
identical nucleic acid that hybridizes only to the exemplified P/CAF-encoding nucleic
acid or allelic variants thereof.

The invention provides an isolated nucleic acid that selectively hybridizes with
the P/CAF-encoding nucleic acid sequence of SEQ ID NO:10 under stringent
conditions. The hybridizing nucleic acid can be a probe that hybridizes only to the
exemplified P/CAF-encoding nucleic acid sequence. Thus, the hybridizing nucleic acid
can be a naturally occurring species-specific allelic variant of the exemplified P/CAF
gene. The hybridizing nucleic acid can also include insubstantial base substitutions that
do not prevent hybridization under the stated stringent conditions or affect either the
function of the encoded protein, the way the protein accomplishes that function (e.g., its
secondary structure) or the ultimate result of the protein's activity. The means for
determining these parameters are well known.

As used herein to describe nucleic acids, the term "selectively hybridizes"
excludes the occasional randomly hybridizing nucleic acids as well as nucleic acids that
encode other known homologs of the P/CAF protein. The selectively hybridizing
nucleic acids of the invention can have at least 70%, 73%, 78%, 80%, 85%, 88%, 90%,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% complementarity with the
segment and strand of the sequence to which it hybridizes. This list is not intended to
exclude percent complementarity values between these values. The nucleic acids can be
at least 10, 15, 16, 17, 18, 20, 21, 23, 24, 25, 30, 35, 40, 50, 100, 150, 200, 300, 500,
550, 750, 900, 950, or 1000 nucleotides in length or any intervening length, depending
on whether the nucleic acid is to be used as a primer, probe or for protein expression.
The hybridizing nucleic acid can comprise a region of at least ten nucleotides (up to full
length) that is completely complementary to a unique region of the nucleic acid to which
it hybridizes.
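
One way to make the percent complementarity figures above concrete is to count Watson-Crick base pairs between a probe and the segment of the strand it is aligned against. The Python sketch below is a simplified illustration under stated assumptions: both sequences are DNA written 5' to 3', they are already aligned over an equal-length segment with no gaps, and the function name and example sequences are hypothetical.

```python
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(probe: str, target_segment: str) -> float:
    """Percentage of probe bases that pair with the antiparallel target segment."""
    probe_seq = probe.upper()
    target_seq = target_segment.upper()
    if not probe_seq or len(probe_seq) != len(target_seq):
        raise ValueError("probe and target segment must be non-empty and equal in length")
    # Strands pair antiparallel: probe base i faces target base (n - 1 - i).
    pairs = sum(
        WATSON_CRICK.get(p) == t
        for p, t in zip(probe_seq, reversed(target_seq))
    )
    return 100.0 * pairs / len(probe_seq)
```

A ten-nucleotide probe aligned against its exact reverse complement scores 100%, and a single mismatch lowers the value to 90%, which is still within the range of values listed above.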

The nucleic acid can be an alternative coding sequence for the P/CAF protein, or
can be used as a probe or primer for detecting the presence of or obtaining the P/CAF
protein. If used as primers, the invention provides compositions including at least two
nucleic acids which selectively hybridize with different regions of the nucleic acid so as
to amplify a desired region. Depending on the length of the probe or primer, it can
range between 70% complementary bases and full complementarity and still hybridize
under stringent conditions.

For example, for the purpose of obtaining or determining the presence of a
nucleic acid encoding the P/CAF protein, the degree of complementarity between the
hybridizing nucleic acid (probe or primer) and the sequence to which it hybridizes
(P/CAF DNA in a sample) should be at least enough to exclude hybridization with a
nucleic acid from another species. The invention provides examples of these nucleic
acids of P/CAF, so that the degree of complementarity required to distinguish selectively
hybridizing from nonselectively hybridizing nucleic acids under stringent conditions can
be clearly determined for each nucleic acid. It should also be clear that the hybridizing
nucleic acids of the invention will not hybridize with nucleic acids encoding unrelated
proteins (hybridization is selective) under stringent conditions.

"Stringent conditions" refers to the washing conditions used in a hybridization
protocol. In general, the washing conditions should be a combination of temperature
and salt concentration chosen so that the denaturation temperature is approximately
5-20°C below the calculated Tm of the nucleic acid hybrid under study. The temperature
and salt conditions are readily determined empirically in preliminary experiments in
which samples of reference DNA immobilized on filters are hybridized to the probe or
protein encoding nucleic acid of interest and then washed under conditions of different
stringencies. For example, the nucleic acid sequence of SEQ ID NO:10 was used as a
specific radiolabeled probe for the detection of messenger RNA transcribed from the
P/CAF gene by performing hybridizations under stringent conditions. The Tm of an
oligonucleotide can be estimated by allowing 2°C for each A or T nucleotide, and 4°C
for each G or C. For example, an 18 nucleotide probe of 50% G+C would, therefore,
have an approximate Tm of 54°C.
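
The Tm estimate just described (2°C per A or T, 4°C per G or C) and the 5-20°C window below it can be worked through in a few lines of Python. This is a minimal sketch of that arithmetic only; the function names and the example 18-mer are hypothetical, and the rule is a rough estimate for short oligonucleotides rather than a substitute for the empirical determination described above.

```python
def estimate_tm(oligo: str) -> int:
    """Estimate Tm in degrees C: 2 per A or T nucleotide, 4 per G or C nucleotide."""
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError(f"not a DNA oligonucleotide: {oligo!r}")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def stringent_window(tm: int) -> tuple[int, int]:
    """Return the (lowest, highest) temperature lying 5-20 degrees C below Tm."""
    return (tm - 20, tm - 5)


# Hypothetical 18-nucleotide probe with 9 A/T and 9 G/C (50% G+C).
example_probe = "ATGCATGCATGCATGCAG"
```

For the example probe, 9 x 2 + 9 x 4 = 54, matching the 54°C figure given in the text, and the corresponding window 5-20°C below that value is 34-49°C.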

The invention provides an isolated nucleic acid that selectively hybridizes with
the P/CAF gene shown in the sequence set forth as SEQ ID NO:10 under stringent
conditions. The invention further provides an isolated nucleic acid complementary to
the nucleotide sequence set forth in SEQ ID NO:10.

Antibodies to the P/CAF protein

A purified antibody and an antiserum containing polyclonal antibodies that
specifically bind the P/CAF protein or antigenic fragment are also provided. The term
"bind" means the well understood antigen/antibody binding as well as other nonrandom
association with an antigen. "Specifically bind" as used herein describes an antibody or
other ligand that does not cross react substantially with any antigen other than the one
specified, in this case, an antigen of the P/CAF protein. Antibodies can be made as
described in Harlow and Lane (33). Briefly, purified P/CAF protein or an antigenic
fragment thereof can be injected into an animal in an amount and in intervals sufficient to
elicit a humoral immune response. Serum polyclonal antibodies can be purified directly,
or spleen cells from the animal can be fused with an immortal cell line and screened for
monoclonal antibody secretion, according to procedures well known in the art. Purified
monospecific polyclonal antibodies that specifically bind the P/CAF antigen are also
within the scope of the present invention. The antibodies of the present invention can
bind the protein of claim 1, the protein of claim 2, the protein of claim 3 and/or the
protein of claim 4, as well as any other proteins of the present invention.

A ligand that specifically binds the antigen is also contemplated. The ligand can
be a fragment of an antibody, such as, for example, an Fab fragment which retains
P/CAF binding activity, or a smaller molecule designed to bind an epitope of the P/CAF
antigen. The antibody or ligand can be bound to a substrate or labeled with a detectable
moiety or both bound and labeled. The detectable moieties contemplated within the
compositions of the present invention include those listed above in the description of the
diagnostic methods, including fluorescent, enzymatic and radioactive markers.



The antibody can be bound to a solid support substrate or conjugated with a
detectable moiety or therapeutic compound or both bound and conjugated. Such
conjugation techniques are well known in the art. For example, conjugation of
fluorescent, radioactive or enzymatic moieties can be performed as described in the art
(33,43). The detectable moieties contemplated in the present invention can include
fluorescent, radioactive and enzymatic markers and the like. Therapeutic drugs
contemplated with the present invention can include cytotoxic moieties such as ricin A
chain, diphtheria toxin, Pseudomonas exotoxin and other chemotherapeutic compounds.

It is well understood by one of skill in the art that all of the above discussion
regarding antibodies to P/CAF can also be applied with regard to production,
characterization and use of antibodies which bind the p300/CBP protein or any of the
DNA-binding transcription factors of this invention.

Measuring the P/CAF protein in a sample

The present invention also provides a method for determining the presence and
thus the amount of P/CAF protein in a biological sample. As used herein, a biological
sample includes any tissue or cell which would contain the P/CAF protein. Examples of
cells include tissues taken from surgical biopsies, isolated from a body fluid or prepared
in an in vitro tissue culture environment.

One example of determining the amount of P/CAF in a biological sample can
comprise contacting the biological sample with a polypeptide comprising the amino acid
sequence of SEQ ID NO:3 under conditions whereby a P/CAF/p300 complex can be
formed; and determining the amount of the P/CAF/p300 complex, the amount of the
complex indicating the amount of P/CAF in the sample. Determination of the amount
of P/CAF/p300 complex can be accomplished through techniques standard in the art.
For example, the complex may be precipitated out of a solution and detected by the
addition of a detectable moiety conjugated to the p300 protein or by the detection of an
antibody which binds p300 or the P/CAF protein, as taught in the Examples herein.
Antibodies which bind p300 or the P/CAF protein can be either monoclonal or
polyclonal antibodies and can be obtained as described herein. Detection of
P/CAF/p300 complexes by the detection of the binding of antibodies reactive with p300
or the P/CAF protein can be accomplished using various immunoassays as are available
in the art, as described below.

Alternatively, determination of the amount of P/CAF in a biological sample can
comprise contacting the biological sample with a polypeptide comprising the amino acid
sequence of SEQ ID NO:9 under conditions whereby a P/CAF/CBP complex can be
formed; and determining the amount of the P/CAF/CBP complex, the amount of the
complex indicating the amount of P/CAF in the sample. Determination of the amount
of P/CAF/CBP complex can be accomplished through techniques standard in the art.
For example, the complex may be precipitated out of a solution and detected by the
addition of a detectable moiety conjugated to the CBP protein or by the detection of an
antibody which binds either CBP or the P/CAF protein, as taught in the Examples
herein. Antibodies which bind CBP or the P/CAF protein can be either monoclonal or
polyclonal antibodies and can be obtained as described herein. Detection of P/CAF/CBP
complexes by the detection of the binding of antibodies reactive with CBP or the P/CAF
protein can be accomplished using various immunoassays as are available in the art, as
described below.



Another example of determining the amount of P/CAF in a biological sample
comprises contacting the biological sample with an antibody which specifically binds
P/CAF under conditions whereby a P/CAF/antibody complex can be formed and
determining the amount of the P/CAF/antibody complex, the amount of the complex
indicating the amount of P/CAF in the sample. Antibodies which bind P/CAF can be
either monoclonal or polyclonal antibodies and can be obtained as described herein.
Determination of P/CAF/antibody complexes can be accomplished using various
immunoassays as are available in the art, as described below.

Immunoassays such as immunofluorescence assays, radioimmunoassays (RIA),
immunoblotting and enzyme linked immunosorbent assays (ELISA) can be readily
adapted for detection and measurement of P/CAF in a biological sample. Both
polyclonal and monoclonal antibodies can be used in the assays. Available
immunoassays are well known in the art and are extensively described in the patent
and scientific literature. See, for example, U.S. Patent Nos. 3,791,932; 3,839,153;
3,850,752; 3,850,578; 3,853,987; 3,867,517; 3,879,262; 3,901,654; 3,935,074;
3,984,533; 3,996,345; 4,034,074; and 4,098,876.
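
As an illustration of how the amount of a detected complex might be converted into an amount of P/CAF, the following is a minimal sketch that assumes a set of calibration standards (known P/CAF amounts with their measured complex signals) is available. The specification does not describe a standard curve; the function name, the units and the use of linear interpolation are assumptions made only for this example.

```python
def interpolate_amount(signal, standards):
    """Estimate the P/CAF amount for a measured complex signal.

    `standards` is a list of (known_amount, measured_signal) pairs from
    hypothetical calibration samples. The estimate is obtained by linear
    interpolation between the two standards that bracket the signal.
    """
    # Sort the calibration points by signal so neighbours bracket a range.
    points = sorted(standards, key=lambda pair: pair[1])
    for (amount_lo, signal_lo), (amount_hi, signal_hi) in zip(points, points[1:]):
        if signal_lo <= signal <= signal_hi:
            if signal_hi == signal_lo:
                return float(amount_lo)
            fraction = (signal - signal_lo) / (signal_hi - signal_lo)
            return amount_lo + (amount_hi - amount_lo) * fraction
    raise ValueError("signal lies outside the range of the standards")


# Hypothetical standards: amount in ng, signal in arbitrary units.
example_standards = [(0, 0.0), (10, 1.0), (20, 2.0)]
estimated_ng = interpolate_amount(1.5, example_standards)
```

A signal outside the calibrated range raises an error rather than being extrapolated, which seems the more cautious choice for a quantitative readout.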

Screening assays for P/CAF

The present invention also provides a bioassay for screening substances for the
ability to inhibit the histone acetyltransferase activity of P/CAF comprising contacting a
system, in which histone acetylation by P/CAF can be determined, with the substance
under conditions whereby histone acetylation by P/CAF can occur; determining the
amount of histone acetylation by P/CAF in the presence of the substance; and comparing
the amount of histone acetylation by P/CAF in the presence of the substance with the
amount of histone acetylation by P/CAF in the absence of the substance, a decreased
amount of histone acetylation by P/CAF in the presence of the substance indicating a
substance that can inhibit the histone acetyltransferase activity of P/CAF. The
acetylation of histones by P/CAF can be determined in a system including, for example,
either core histones (histones H2A, H2B, H3 and H4) or the nucleosome core particles
(146 base pairs of DNA wrapped around the octamer of core histones) as substrates, the
P/CAF protein and radiolabeled acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of
acetylated histones can be detected by autoradiography after separation by SDS-PAGE
as described herein in the Examples. Thus, the compound to be tested for the ability to
inhibit the histone acetyltransferase activity of P/CAF can be added to this system and
assayed for inhibiting ability.
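
The comparison step of this assay can be sketched numerically. The example below assumes that the acetylation signal (for instance, band intensity from an autoradiograph or scintillation counts) has been quantified for replicate reactions with and without the test substance. The function names and the 50 percent cutoff are hypothetical choices for illustration and are not taken from the specification.

```python
from statistics import mean


def percent_inhibition(control_signals, treated_signals):
    """Percent decrease in acetylation signal relative to the control.

    `control_signals` are replicate measurements without the substance and
    `treated_signals` are replicate measurements with the substance.
    A negative result means the signal increased.
    """
    control = mean(control_signals)
    treated = mean(treated_signals)
    if control <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (control - treated) / control


def is_inhibitor(control_signals, treated_signals, threshold_pct=50.0):
    """Flag a substance whose inhibition meets an assumed cutoff."""
    return percent_inhibition(control_signals, treated_signals) >= threshold_pct


# Hypothetical replicate counts (arbitrary units).
inhibition = percent_inhibition([1000, 1000], [250, 250])
```

The same calculation would apply to the stimulation assays by looking for a negative percent inhibition, that is, an increase over the control.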

The present invention also provides a bioassay for screening substances for the
ability to inhibit the transcription modulating activity of P/CAF, comprising contacting a
system, in which histone acetylation by P/CAF can be determined, with the substance
under conditions whereby histone acetylation by P/CAF can occur; determining the
amount of histone acetylation by P/CAF in the presence of the substance; and comparing
the amount of histone acetylation by P/CAF in the presence of the substance with the
amount of histone acetylation by P/CAF in the absence of the substance, a decreased
amount of histone acetylation by P/CAF in the presence of the substance indicating a
substance that can inhibit the transcription modulating activity and cell cycle progression
suppressing activity of P/CAF. The acetylation of histones by P/CAF can be determined
in a system including, for example, either core histones (histones H2A, H2B, H3 and
H4) or the nucleosome core particles (146 base pairs of DNA wrapped around the
octamer of core histones) as substrates, the P/CAF protein and radiolabeled acetyl-CoA
(e.g., [1-14C]acetyl-CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to inhibit the transcription modulating
activity of P/CAF by interfering with the histone acetyltransferase activity of P/CAF can
be added to this system and assayed for inhibiting ability.

Also provided in the present invention is a bioassay for screening substances for
the ability to inhibit the binding of p300 to P/CAF, comprising contacting a system in
which the binding of p300 to P/CAF can be determined, with the substance under
conditions whereby the binding of p300 and P/CAF can occur; determining the amount
of p300 binding to P/CAF in the presence of the substance; and comparing the amount
of p300 binding to P/CAF in the presence of the substance with the amount of p300
binding to P/CAF in the absence of the substance, a decreased amount of p300 binding
to P/CAF in the presence of the substance indicating a substance that can inhibit the
binding of p300 to P/CAF. The binding of p300 to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the p300 protein comprising the amino acid sequence of SEQ ID NO:3 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells
producing both p300 and P/CAF. Determination of the binding of p300 to P/CAF can
be carried out as taught herein.

Additionally provided in the present invention is a bioassay for screening
substances for the ability to inhibit the binding of CBP to P/CAF, comprising contacting
a system in which the binding of CBP to P/CAF can be determined, with the substance
under conditions whereby the binding of CBP to P/CAF can occur; determining the
amount of CBP binding to P/CAF in the presence of the substance; and comparing the
amount of CBP binding to P/CAF in the presence of the substance with the amount of
CBP binding to P/CAF in the absence of the substance, a decreased amount of CBP
binding to P/CAF in the presence of the substance indicating a substance that can inhibit
the binding of CBP to P/CAF. The binding of CBP to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the CBP protein comprising the amino acid sequence of SEQ ID NO:9 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells
producing both CBP and P/CAF. Determination of the binding of CBP to P/CAF can be
carried out as taught herein.

The present invention further contemplates a bioassay for screening substances
for the ability to stimulate the histone acetyltransferase activity of P/CAF comprising
contacting a system, in which histone acetylation by P/CAF can be determined, with the
substance; determining the amount of histone acetylation by P/CAF in the presence of
the substance; and comparing the amount of histone acetylation by P/CAF in the
presence of the substance with the amount of histone acetylation by P/CAF in the
absence of the substance, an increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can stimulate the histone
acetyltransferase activity of P/CAF. The acetylation of histones by P/CAF can be
determined in a system including, for example, either core histones (histones H2A, H2B,
H3 and H4) or the nucleosome core particles (146 base pairs of DNA wrapped around
the octamer of core histones) as substrates, the P/CAF protein and radiolabeled
acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to stimulate the histone acetyltransferase
activity of P/CAF can be added to this system and assayed for stimulating ability.

The present invention further contemplates a bioassay for screening substances
for the ability to stimulate the transcription modulating activity of P/CAF comprising
contacting a system, in which histone acetylation by P/CAF can be determined, with the
substance; determining the amount of histone acetylation by P/CAF in the presence of
the substance; and comparing the amount of histone acetylation by P/CAF in the
presence of the substance with the amount of histone acetylation by P/CAF in the
absence of the substance, an increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can stimulate the transcription
modulating activity of P/CAF. The acetylation of histones by P/CAF can be determined
in a system including, for example, either core histones (histones H2A, H2B, H3 and
H4) or the nucleosome core particles (146 base pairs of DNA wrapped around the
octamer of core histones) as substrates, the P/CAF protein and radiolabeled acetyl-CoA
(e.g., [1-14C]acetyl-CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to stimulate the transcription modulating
activity of P/CAF by increasing the histone acetyltransferase activity of P/CAF can be
added to this system and assayed for stimulating ability.

The present invention further provides a bioassay for screening substances for
the ability to stimulate binding of p300 to P/CAF, comprising contacting a system in
which the binding of p300 to P/CAF can be determined, with the substance under
conditions whereby the binding of p300 to P/CAF can occur; determining the amount of
p300 binding to P/CAF in the presence of the substance; and comparing the amount of
p300 binding to P/CAF in the presence of the substance with the amount of p300
binding to P/CAF in the absence of the substance, an increased amount of p300 binding
to P/CAF in the presence of the substance indicating a substance that can stimulate the
binding of p300 to P/CAF. The binding of p300 to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the p300 protein comprising the amino acid sequence of SEQ ID NO:3 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells
producing both p300 and P/CAF. Determination of the binding of p300 to P/CAF can
be carried out as taught herein.

Additionally provided in the present invention is a bioassay for screening 
substances for the ability to stimulate the binding of CBP to P/CAF, comprising 
contacting a system in which the binding of CBP to P/CAF can be determined, with the 
substance under conditions whereby the binding of CBP to P/CAF can occur; 
determining the amount of CBP binding to P/CAF in the presence of the substance; and 
comparing the amount of CBP binding to P/CAF in the presence of the substance with 
the amount of CBP binding to P/CAF in the absence of the substance, an increased 
amount of CBP binding to P/CAF in the presence of the substance indicating a 
substance that can stimulate the binding of CBP to P/CAF. The binding of CBP to 
P/CAF can be determined in a system, for example, which can include a cell free 
reaction mixture comprising a fragment of the CBP protein comprising the amino acid 
sequence of SEQ ID NO:9 and P/CAF. Alternatively, the system can comprise a cell
extract produced from cells producing both CBP and P/CAF. Determination of the
binding of CBP to P/CAF can be carried out as taught herein.
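
The binding assays above share one comparison: the amount of p300 or CBP bound to P/CAF with the substance versus without it. A minimal sketch of that decision follows, assuming the bound amount has already been quantified in each condition. The function name and the 1.5-fold cutoff are hypothetical and are not part of the specification.

```python
def classify_binding_effect(bound_without, bound_with, min_fold_change=1.5):
    """Classify a substance by its effect on p300/CBP binding to P/CAF.

    `bound_without` and `bound_with` are the quantified amounts of bound
    p300 or CBP in the absence and presence of the substance. The assumed
    cutoff `min_fold_change` guards against calling small differences.
    """
    if bound_without <= 0 or bound_with < 0:
        raise ValueError("binding amounts must be positive (control) and non-negative")
    ratio = bound_with / bound_without
    if ratio >= min_fold_change:
        return "stimulates"
    if ratio <= 1.0 / min_fold_change:
        return "inhibits"
    return "no clear effect"


# Hypothetical measurements in arbitrary units.
effect = classify_binding_effect(100.0, 40.0)
```

Using a symmetric fold-change cutoff treats increases and decreases the same way; in practice the cutoff would depend on the variability of the particular assay.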

Transcription modulating activity of P/CAF

The present invention contemplates a method for inhibiting the transcription
modulating activity of P/CAF in a subject, comprising administering to the subject a
transcription modulating activity inhibiting amount of a substance in a pharmaceutically
acceptable carrier. For example, the substance can be identified according to the
protocols provided herein as one that can inhibit the transcription modulating activity of
P/CAF by preventing the binding of P/CAF to p300/CBP or by inhibiting the histone
acetyltransferase activity of P/CAF as well as by any other inhibitory mechanism as
identified by the protocols provided herein. Inhibition of the transcription modulating
activity of P/CAF in a subject is desirable, for example, to inhibit HIV TAT-mediated
transcription and therefore, the method of the present invention can be used to treat
HIV-infected subjects.

The substance can be in a pharmaceutically acceptable carrier. By 
"pharmaceutically acceptable" is meant a material that is not biologically or otherwise 
undesirable, i.e., the material may be administered to a subject, along with the substance, 
without causing any undesirable biological effects or interacting in a deleterious manner
with any of the other components of the pharmaceutical composition in which it is 
contained. The carrier would naturally be selected to minimize any degradation of the 
active ingredient and to minimize any adverse side effects in the subject. 

The transcription modulating activity and/or histone acetyltransferase activity of
P/CAF can be inhibited in a subject by administering to the subject a substance which
binds p300/CBP at the P/CAF binding site or a substance which binds the P/CAF
protein at the p300/CBP binding site, the ultimate result being that P/CAF and
p300/CBP do not bind with one another and P/CAF cannot exert its transcription
modulating and/or histone acetyltransferase effect. The substance can be a protein, such
as an antibody which binds the P/CAF protein at or near the p300/CBP
binding site, thereby preventing its binding, or an antibody which binds the p300/CBP
protein at or near the P/CAF binding site, thereby preventing its binding. The substance
can also bind at the histone acetyltransferase site on P/CAF or at the acetylation site on the
histone, thereby preventing acetylation by P/CAF.

The substance which binds p300/CBP, the P/CAF protein or the histone and has 
the net effect of inhibiting the transcription modulating effect and/or histone
acetyltransferase activity of P/CAF in the cell can be delivered to a cell in the subject by 
mechanisms well known in the art. 

Alternatively, a nucleic acid encoding a protein which binds either to p300/CBP 
or the P/CAF protein and has the net effect of inhibiting the transcription modulating 
effect and/or histone acetyltransferase activity of P/CAF in the cell can be delivered to a 
cell in the subject by gene transduction mechanisms well known in the art. For example, 
nucleic acid can be introduced by liposomes as well as via retroviral or adeno-associated
viral vectors, as described below. 

The substance which inhibits the transcription modulating effect and/or histone
acetyltransferase activity of P/CAF can be an antisense RNA or an antisense DNA which
binds the RNA or DNA of P/CAF, thereby preventing translation or transcription of the
RNA or DNA encoding P/CAF and having the net effect of inhibiting the transcription
modulating effect and/or histone acetyltransferase activity of P/CAF by inhibiting P/CAF
production. The antisense RNA of the present invention can be generated from the
nucleic acid of SEQ ID NO:14 (human) or SEQ ID NO:15 (mouse). Furthermore, the
antisense DNA can be a phosphorothioate oligodeoxyribonucleotide having the
nucleotide sequence of SEQ ID NO:16 (human) or of SEQ ID NO:17 (mouse). The
mouse antisense RNA can be used to inhibit the activity of mouse P/CAF, having the
nucleotide sequence of SEQ ID NO:18 and the amino acid sequence of SEQ ID NO:8.
The present invention also contemplates an antisense nucleic acid sequence which can
bind the DNA or RNA of any of the transcription factors or other proteins now known
or later identified to bind P/CAF, thereby inhibiting expression of the gene products of
these proteins and having the net effect of inhibiting the transcription modulating effect
and/or histone acetyltransferase activity of P/CAF.

The antisense nucleic acid can comprise a typical nucleic acid, but the antisense 
nucleic acid can also be a modified nucleic acid or a derivative of a nucleic acid such as a
phosphorothioate analogue of a nucleic acid. The composition can comprise, for 
example, an antisense RNA that specifically binds an RNA encoded by the gene 
encoding the serum protein. Antisense RNAs can be synthesized and used by standard 
methods (62). 

Antisense RNA can inhibit gene expression by forming an RNA/RNA duplex
between the antisense RNA and the RNA transcribed from the target gene. The precise
mechanism by which this duplex formation decreases the production of the protein
encoded by the endogenous gene probably involves binding of complementary regions
of the normal sense mRNA and the antisense RNA strand with duplex formation in a
manner that blocks RNA processing and translation. Alternative mechanisms include
the formation of a triplex between the antisense RNA and duplex DNA or the formation
of a DNA-RNA duplex with subsequent degradation of DNA-RNA hybrids by RNase
H. Furthermore, an antigene effect can result from certain DNA-based oligonucleotides
via triple-helix formation between the oligomer and double-stranded DNA which results
in the repression of gene transcription. Regardless of the specific molecular mechanism,
the present invention results in inhibition of expression of the P/CAF gene by the
introduced and replicated DNA resulting in inhibition of the transcription modulating
and/or histone acetyltransferase activity of P/CAF, by a reduction in the expression of
the nucleic acid to which the antisense nucleic acid is hybridized, and therefore a
reduction of the gene product from the targeted gene.
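
Because duplex formation depends on base complementarity, an antisense DNA sequence can be derived from a sense-strand target by taking its reverse complement. The following is a minimal sketch of that step. The function name and the six-base input are hypothetical illustrations and are not any of the sequences of the SEQ ID NOs referred to herein.

```python
# Translation table for Watson-Crick complements of DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(sense_5_to_3):
    """Return the antisense (reverse complement) DNA strand, written 5' to 3'.

    `sense_5_to_3` is a sense-strand DNA sequence written 5' to 3' and
    containing only the bases A, C, G and T.
    """
    if set(sense_5_to_3.upper()) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return sense_5_to_3.translate(_DNA_COMPLEMENT)[::-1]


# Hypothetical six-base target, used only to show the transformation.
oligo = antisense_dna("ATGGCC")
```

The corresponding antisense RNA would have the same sequence with U in place of T.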

The antisense nucleic acid may be obtained by any number of techniques known
to one skilled in the art. One method of constructing an antisense nucleic acid is to
synthesize a recombinant antisense DNA molecule. For example, oligonucleotide
synthesis procedures are routine in the art and oligonucleotides coding for a particular
protein or regulatory region are readily obtainable through automated DNA synthesis.
A nucleic acid for one strand of a double-stranded molecule can be synthesized and
hybridized to its complementary strand. One can design these oligonucleotides such that
the resulting double-stranded molecule has either internal restriction sites or appropriate
5' or 3' overhangs at the termini for cloning into an appropriate vector. Double-stranded
molecules coding for relatively large proteins or regulatory regions can be synthesized
by first constructing several different double-stranded molecules that code for particular
regions of the protein or regulatory region, followed by ligating these DNA molecules
together. Once the appropriate DNA molecule is synthesized, this DNA can be cloned
downstream of a promoter in an antisense orientation. Techniques such as this are
routine in the art and are well documented.

An example of another method of obtaining an antisense nucleic acid is to isolate
that nucleic acid from the organism in which it is found and clone it in an antisense
orientation. For example, a DNA or cDNA library can be constructed and screened for
the presence of the nucleic acid of interest. Methods of constructing and screening such
libraries are well known in the art and kits for performing the construction and screening
steps are commercially available (for example, Stratagene Cloning Systems, La Jolla,
CA). Once isolated, the nucleic acid can be directly cloned into an appropriate vector in
an antisense orientation, or if necessary, be modified to facilitate the subsequent cloning
steps. Such modification steps are routine, an example of which is the addition of
oligonucleotide linkers which contain restriction sites to the termini of the nucleic acid.
General methods are set forth in Sambrook et al. (39).

The DNA that is introduced into the cell is in an expression orientation that is
antisense to a corresponding endogenous DNA or RNA of the cells. For example,
where an endogenous DNA comprises a gene which encodes a particular protein, the
introduced DNA is in an expression orientation opposite the expression of the
endogenous DNA; that is, the DNA operatively linked to a promoter is in an antisense
expression orientation relative to the corresponding endogenous gene. The introduced
DNA may be homologous to the entire transcribed gene or homologous to only part of
the transcribed gene. Alternatively, the sequence of the introduced DNA may be
divergent from that of the endogenous DNA but only divergent to the extent that
hybridization of the nucleic acids occurs, thereby preventing transcription. One skilled
in the art can determine the maximum extent of this divergence by routine screening of
antisense DNAs corresponding to an endogenous DNA of the cell. In this manner, one
skilled in the art can readily determine which fragments, or alternatively the extent of
homology of the fragments or the entire gene, are necessary to inhibit gene
expression.
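
When candidate antisense DNAs of varying divergence are screened, it can help to record how complementary each candidate is to its target region. The sketch below computes the fraction of Watson-Crick paired positions for an ungapped, equal-length alignment. The function name and the sequences are hypothetical, and this simple fraction is offered only as a bookkeeping aid; it is not a predictor of hybridization, which is determined experimentally as described above.

```python
# Watson-Crick base pairs between two DNA strands.
_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def fraction_complementary(antisense_5_to_3, target_5_to_3):
    """Fraction of positions at which an antisense DNA pairs with its target.

    Both sequences are written 5' to 3' and must have the same length.
    The antisense strand is reversed so that it is aligned antiparallel
    to the target before the positions are compared.
    """
    antisense = antisense_5_to_3.upper()[::-1]
    target = target_5_to_3.upper()
    if not target or len(antisense) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((a, t) in _PAIRS for a, t in zip(antisense, target))
    return paired / len(target)


# Hypothetical sequences: a perfect antisense and one with a single mismatch.
perfect = fraction_complementary("GGCCAT", "ATGGCC")
one_mismatch = fraction_complementary("GGCCAA", "ATGGCC")
```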

The antisense nucleic acids of the present invention can be made according to
protocols standard in the art, as well as described in the Examples provided herein. The
antisense nucleic acids can be administered to a subject according to the gene
transduction protocols standard in the art, as described below.

The present invention also contemplates a method for stimulating the
transcription modulating activity and/or histone acetyltransferase activity of P/CAF in a
subject comprising administering to the subject a substance, in a pharmaceutically
acceptable carrier, determined according to the methods taught herein, to have a
stimulatory effect on the transcription modulating and/or histone acetyltransferase
activity of P/CAF. The substance can be one which has been identified, according to the
protocols provided herein, to stimulate histone acetyltransferase activity in P/CAF or
promote binding of P/CAF to p300/CBP. The stimulation of the transcription
modulation activity and/or histone acetyltransferase activity of P/CAF in a subject is
desirable, for example, to activate tumor suppressor p53 (which promotes apoptosis) or
to activate the muscle differentiation factor, MyoD. Thus, the method of the present
invention can be employed to treat cancer and to promote muscle differentiation in
conditions where muscle differentiation is desired. The substance can be delivered to a
cell in the subject by mechanisms well known in the art.

Further contemplated in the present invention is a method for promoting binding
of P/CAF to p300/CBP in a subject, comprising administering to the subject a substance
identified by the methods provided herein to promote binding of P/CAF to either p300
or CBP.

Additionally, a nucleic acid encoding a protein which stimulates the transcription 
modulating activity and/or histone acetyltransferase activity of P/CAF can be delivered
to a cell in the subject by gene transduction mechanisms, as described below. 

Also provided in the present invention is a method of inhibiting the cell cycle
progression inducing effect of an oncoprotein which binds p300/CBP in a subject
comprising transducing the cells of the subject with a vector comprising a nucleic acid
encoding the P/CAF protein; inducing expression of the nucleic acid in the cell to
produce the P/CAF in an amount which will allow the P/CAF gene product to replace
the oncoprotein bound to p300/CBP, whereby the replacement of the oncoprotein
bound to p300/CBP by the P/CAF gene product inhibits the cell cycle progression
inducing effect of the oncoprotein. The oncoprotein which binds p300/CBP in the cell
can be the adenovirus E1A oncoprotein.

A method for providing a functional P/CAF protein to a subject in need of the
functional P/CAF protein is also provided, comprising transducing the cells of the
subject with a vector comprising a nucleic acid encoding the P/CAF protein and
inducing expression of the nucleic acid to produce the functional P/CAF protein in the
cell, thereby providing the functional P/CAF protein to the subject. The transduction of
the vector nucleic acid into the subject's cells can be carried out according to standard
gene therapy protocols well known in the art (see, for example, U.S. Patent No.
5,339,346).

Screening assays for p300/CBP 

The present invention also provides a bioassay for screening substances for the
ability to inhibit the histone acetyltransferase activity of p300/CBP comprising
contacting a system, in which histone acetylation by p300/CBP can be determined, with
the substance under conditions whereby histone acetylation by p300/CBP can occur;
determining the amount of histone acetylation by p300/CBP in the presence of the
substance; and comparing the amount of histone acetylation by p300/CBP in the
presence of the substance with the amount of histone acetylation by p300/CBP in the
absence of the substance, a decreased amount of histone acetylation by p300/CBP in the
presence of the substance indicating a substance that can inhibit the histone
acetyltransferase activity of p300/CBP. The acetylation of histones by p300/CBP can be
determined in a system including, for example, either core histones (histones H2A, H2B,
H3 and H4) or the nucleosome core particles (146 base pairs of DNA wrapped around
the octamer of core histones) as substrates, the p300/CBP protein and radiolabeled
acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of acetylated histones can be
detected by autoradiography after separation by SDS-PAGE as described herein in the
Examples. Thus, the compound to be tested for the ability to inhibit the histone
acetyltransferase activity of p300/CBP can be added to this system and assayed for
acetyltransferase inhibiting ability.

Also provided in the present invention is a bioassay for screening substances for
the ability to inhibit the binding of a transcriptional factor to p300/CBP, comprising
contacting a system in which the binding of a transcriptional factor to p300/CBP can be
determined, with the substance under conditions whereby the binding of the
transcriptional factor and p300/CBP can occur; determining the amount of
transcriptional factor binding to p300/CBP in the presence of the substance; and
comparing the amount of transcriptional factor binding to p300/CBP in the presence of
the substance with the amount of transcriptional factor binding to p300/CBP in the
absence of the substance, a decreased amount of transcriptional factor binding to
p300/CBP in the presence of the substance indicating a substance that can inhibit the
binding of a transcriptional factor to p300/CBP. The binding of a transcriptional factor
to p300/CBP can be determined in a system, for example, which can include a cell free
reaction mixture comprising a transcriptional factor which binds p300/CBP and
p300/CBP. Alternatively, the system can comprise a cell extract produced from cells
producing both a transcriptional factor which binds p300/CBP and p300/CBP. The
transcriptional factor which binds p300/CBP can be selected from, but is not limited to,
the group consisting of nuclear hormone receptors, CREB, c-Jun/v-Jun, c-Myb/v-Myb,
YY1, Sap-1a, c-Fos, MyoD and SRC-1, as well as any other transcriptional factor now
known or later identified to bind p300/CBP. The screening assay of the present
invention can also be used to identify substances which inhibit the binding of p300/CBP
to other components to which it is known to bind, for example, P/CAF, pp90RSK, TFIIB,
E1A, SV40 large T antigen, as well as any other substances now known or later
identified to bind p300/CBP. Determination of the binding of a transcriptional factor or
other substance to p300/CBP can be carried out as taught in the Examples herein as well
as by protocols described in the literature.

The present invention further contemplates a bioassay for screening substances
for the ability to stimulate the histone acetyltransferase activity of p300/CBP comprising
contacting a system, in which histone acetylation by p300/CBP can be determined, with
the substance; determining the amount of histone acetylation by p300/CBP in the
presence of the substance; and comparing the amount of histone acetylation by
p300/CBP in the presence of the substance with the amount of histone acetylation by
p300/CBP in the absence of the substance, an increased amount of histone acetylation
by p300/CBP in the presence of the substance indicating a substance that can stimulate
the histone acetyltransferase activity of p300/CBP. The acetylation of histones by
p300/CBP can be determined in a system including, for example, either core histones
(histones H2A, H2B, H3 and H4) or the nucleosome core particles (146 base pairs of
DNA wrapped around the octamer of core histones) as substrates, the p300/CBP
protein and radiolabeled acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of
acetylated histones can be detected by autoradiography after separation by SDS-PAGE
as described herein in the Examples. Thus, the compound to be tested for the ability to
stimulate the histone acetyltransferase activity of p300/CBP can be added to this system
and assayed for stimulating ability.



The present invention further provides a bioassay for screening substances for
the ability to stimulate binding of a component, which binds p300/CBP, to p300/CBP,
comprising contacting a system in which the binding of the component to p300/CBP can
be determined, with the substance under conditions whereby the binding of the
component to p300/CBP can occur; determining the amount of component binding to
p300/CBP in the presence of the substance; and comparing the amount of component
binding to p300/CBP in the presence of the substance with the amount of component
binding to p300/CBP in the absence of the substance, an increased amount of
component binding to p300/CBP in the presence of the substance indicating a substance
that can stimulate the binding of the component to p300/CBP. The binding of the
component to p300/CBP can be determined in a system, for example, which can include
a cell free reaction mixture comprising the component and p300/CBP. Alternatively, the
system can comprise a cell extract produced from cells producing both the component
and p300/CBP. The component which binds p300/CBP can be any of the transcriptional
factors or other proteins which are known or are identified in the future to bind
p300/CBP, as set forth above. Determination of the binding of the component to
p300/CBP can be carried out as taught in the Examples provided herein and according
to protocols available in the literature.

Histone acetyltransferase activity of p300/CBP 

A method for inhibiting the histone acetyltransferase activity of p300/CBP in a 
subject is provided in the present invention, comprising administering to the subject a 
histone acetyltransferase activity inhibiting amount of a substance in a pharmaceutically 
acceptable carrier. The mechanism of the inhibitory action of the substance can be the 
inhibition of the binding of a DNA-binding transcription factor, such as, for example, a 
nuclear hormone receptor, CREB, c-Jun/v-Jun, c-Myb/v-Myb, YY1, Sap-1a, c-Fos, 
MyoD or SRC-1, to p300/CBP. 

The histone acetyltransferase activity of p300/CBP can be inhibited in a subject 
by administering to the subject a substance which binds p300/CBP at the transcription 
factor binding site or a substance which binds the transcription factor protein at the 
p300/CBP binding site, the ultimate result being that the transcription factor and 
p300/CBP do not bind with one another and p300/CBP cannot acetylate histones. 

The substance which binds either to the transcription factor or the p300/CBP 
protein and has the net effect of inhibiting the histone acetyltransferase activity of 
p300/CBP in the cell can be identified according to the screening methods provided 
herein and delivered to a cell in the subject by mechanisms well known in the art. The 
substance can be a protein, such as an antibody which binds the p300/CBP protein 
binding site at or near the DNA-binding transcription factor binding site, thereby 
preventing its binding, or an antibody which binds the DNA-binding transcription factor 
at or near the p300/CBP binding site, thereby preventing its binding. The substance can 
also bind the histone acetyltransferase site on p300/CBP (aa 1195-1673 on p300 or aa 
1174-1850 on CBP) or at the acetylation site on the histone, thereby preventing 
acetylation by p300/CBP. 

Additionally, the substance can be a nucleic acid which can be expressed in the 
cell to produce a protein which inhibits the histone acetyltransferase activity of 
p300/CBP. For example, a nucleic acid encoding a protein which binds either to a 
transcription factor or the p300/CBP protein and has the net effect of inhibiting the 
histone acetyltransferase activity of p300/CBP in the cell can be delivered to a cell in the 
subject by gene transduction mechanisms well known in the art. For example, nucleic 
acid can be introduced by liposomes as well as via retroviral or adeno-associated viral 
vectors, as described below. 

The substance which inhibits the histone acetyltransferase activity of p300/CBP 
can be an antisense RNA or an antisense DNA which binds the RNA or DNA of 
p300/CBP thereby preventing translation or transcription of the RNA or DNA encoding 
p300/CBP and having the net effect of inhibiting the histone acetyltransferase activity of 
P/CAF by inhibiting p300/CBP production. The antisense RNA or DNA of the present 
invention can be produced and introduced into cells according to the same methods as 
set forth above for P/CAF antisense nucleic acids. 

The present invention also contemplates a method for stimulating the histone 
acetyltransferase activity of p300/CBP in a subject comprising administering to the 
subject a histone acetyltransferase activity stimulating amount of a substance, in a 
pharmaceutically acceptable carrier, determined according to the methods taught herein, 
to have a stimulatory effect on the histone acetyltransferase activity of p300/CBP. The 
substance can exert a stimulatory effect by promoting the binding of a DNA-binding 
transcription factor of the present invention to p300/CBP. The substance can be 
delivered to a cell in the subject by mechanisms well known in the art. A nucleic acid 
encoding a protein which stimulates the transcription modulating activity of p300/CBP 
can be delivered to a cell in the subject by gene transduction mechanisms, as described 
below. 

Gene transduction 

In the methods described above which include gene transduction into cells (i.e., 
addition of exogenous DNA into cells), the nucleic acids of the present invention can be 
in a vector for delivering the nucleic acids to the site for expression of the P/CAF 
protein. The vector can be one of the commercially available preparations, such as the 
pGM plasmid (Promega). Vector delivery can be by liposome, using commercially 
available liposome preparations or newly developed liposomes having the features of the 
present liposomes. Additionally, vector delivery can be via a viral system, including, but 
not limited to, retroviral, adenoviral and adeno-associated viral systems. Other delivery 
methods can be adopted and routinely tested according to the methods taught herein. 

The modes of administration of the liposome will vary predictably according to 
the disease being treated and the tissue being targeted. For example, for treating cancer 
in either the lung or the liver, which are both sinks for liposomes, intravenous delivery is 
reasonable. For other localized cancers, as well as precancerous conditions, 
catheterization of an artery upstream from the target organ is a preferred mode of 
delivery, because it avoids significant clearance of the liposome by the lung and liver. 
For cancerous lesions at a number of other sites (e.g., skin cancer, localized dysplasias), 
topical delivery is expected to be effective and may be preferred, because of its 
convenience. 

Leukemias and other disorders involving dysregulated proliferation of certain 
isolatable cell populations may be more readily treated by ex vivo administration of the 
nucleic acid. 

The liposomes may be administered topically, parenterally (e.g., intravenously), 
by intramuscular injection, by intraperitoneal injection, transdermally, extracorporeally 
or the like, although intravenous or topical administration is typically preferred. The 
exact amount of the liposomes required will vary from subject to subject, depending on 
the species, age, weight and general condition of the subject, the severity of the disease 
being treated, the particular compound used, its mode of administration and the like. 
Thus, it is not possible to specify an exact amount. However, an appropriate amount 
may be determined by one of ordinary skill in the art using only routine experimentation 
given the teachings herein. 

Parenteral administration, if used, is generally characterized by injection. 
Injectables can be prepared in conventional forms, either as liquid solutions or 
suspensions, solid forms suitable for solution or suspension in liquid prior to injection, or 
as emulsions. A more recently revised approach for parenteral administration involves 
use of a slow release or sustained release system such that a constant level of dosage is 
maintained. See, e.g., U.S. Patent No. 3,610,795, which is incorporated by reference 
herein. 

Topical administration can be by creams, gels, suppositories and the like. Ex 
vivo (extracorporeal) delivery can be as typically used in other contexts. 

The present invention is more particularly described in the following examples 
which are intended as illustrative only since numerous modifications and variations 
therein will be apparent to those skilled in the art. 



EXAMPLES 

I. P/CAF studies. 

Cloning and characterization of P/CAF protein. 

In human cells, CBP binds to c-Jun in a phosphorylation-dependent manner in 
association with stimulation of transcription (9). In yeast, GCN4 is believed to be a 
c-Jun counterpart on the basis of similarities in DNA recognition (15) as well as the 
participation of both proteins in UV signaling pathways (16). Yeast genetic screening 
has led to the isolation of various cofactors for GCN4, including GCN5 (yGCN5), 
ADA2 (yADA2) and ADA3 (yADA3) (17-19). These factors are considered to 
function as a complex (or in a common pathway) based on genetic and protein-protein 
interaction studies (18-22). Finally, p300/CBP and yADA2 exhibit significant sequence 
similarity within a 50 amino acid region including a Zn2+ finger motif (3). Human 
counterparts to yGCN5, yADA2, or yADA3 that interact with p300/CBP to mediate 
transcriptional activation by c-Jun were searched for in various nucleotide sequence 
databases. 

Comparison of the yGCN5 protein sequence with various databases (23) 
revealed significant similarities with the two randomly sequenced human cDNAs, 
ETS05039 (24) (P=4.0x10^-15) and NIB2000-5R (P=6.5x10^-9). Given that these cDNAs 
were truncated, human fetal liver and fetal brain cDNA libraries (Clontech) were 
screened with ETS05039 and NIB2000-5R, respectively, and complete clones were 
isolated from the human fetal liver cDNA library. The complete sequences revealed that 
the ETS05039- and NIB2000-5R-derived clones are encoded by distinct genes but are 
highly related within the protein coding regions (68% identity at the DNA level; 75% 
identity and 86% similarity at the protein level). The former encodes an N-terminal 
region with no sequence similarity to any proteins in the databases besides the 
yGCN5-related C-terminal region, whereas the latter encodes only the yGCN5-related 
region. Given that p300/CBP-binding activity was observed in the former polypeptide as 
shown below, it was designated p300/CBP-associated factor (P/CAF), having the amino 
acid sequence of SEQ ID NO: 1 and the nucleotide sequence of SEQ ID NO: 10, and the 
latter was named human GCN5 (hGCN5), having the amino acid sequence of SEQ ID NO: 5 
and the nucleotide sequence of SEQ ID NO: 11. 
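
Percent identity figures such as those quoted above are computed over an alignment of 
the two sequences. The sketch below shows one common convention (matches divided by 
gap-free aligned columns). It is illustrative only: the function name and the toy 
sequences are hypothetical, the sequences are not taken from the sequence listing, and 
the text does not state which convention or alignment program was used. 

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy aligned fragments (hypothetical).
print(f"{percent_identity('MKLV-EQR', 'MKIVAEQK'):.1f}% identity")
```

Percent similarity is computed the same way, except that conservative substitutions, as 
defined by a scoring matrix, are also counted as matches. 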



Additionally, an RNA blot (Clontech) was hybridized with a random-primed 
probe made from the cDNA encoding P/CAF. RNA blotting indicated that transcripts 
detected by the P/CAF and hGCN5 cDNAs are ubiquitously expressed, but the former is 
most abundant in heart and skeletal muscle, whereas the latter is most abundant in 
pancreas and skeletal muscle. 



P/CAF-p300/CBP interaction in vitro 

The P/CAF binding site was presumed to reside in the C terminal one third of 
CBP (residues 1,678-2,442) because it was observed that this region, when fused to a 
DNA binding domain, activates transcription (4) in a manner repressed by coexpression 
of 12S E1A. This region was divided into 6 overlapping fragments and each was 
expressed in E. coli as a glutathione-S-transferase (GST) fusion protein. GST-CBP 
fusions were incubated with recombinant P/CAF protein and, subsequently, purified 
using glutathione-Sepharose. Co-purified P/CAF was detected by immunoblotting 
analysis. 

To construct GST-fusions, various regions of CBP and p300 were amplified by 
PCR. A series of deletions of the CBP segment B was created by site-directed in vitro 
mutagenesis (30). These fragments were subcloned into pGEX-2T (Pharmacia). 
GST-fusions were expressed in E. coli and extracted with buffer B [20 mM Tris-HCl (pH 
8.0), 5 mM MgCl2, 10% glycerol, 1 mM AEBSF, 0.1% NP40, 10 µg/ml of aprotinin, 10 
µg/ml of leupeptin, 1 µg/ml of pepstatin A, 1 mM DTT] containing 0.1 M KCl for these 
experiments. GST-CBP-segment B was purified by glutathione-Sepharose and 
phenyl-Sepharose chromatographic steps. P/CAF, hGCN5, and E1A were expressed as 
FLAG-fusions in Sf9 cells via baculovirus vectors and affinity-purified with M2-agarose 
(ref 30; Kodak-IBI). For interaction, a crude E. coli extract containing 20 pmol of 
GST-fusion was incubated with 40-60 pmol of P/CAF or E1A in a total volume of 50 µl of 
buffer B with 0.1 M KCl on ice for 10 min. Samples were further incubated with 10 µl 
(packed volume) of glutathione-Sepharose at 4 °C for 30 min, washed four times with 
200 µl of buffer B containing 0.1 M KCl, and eluted with 20 µl of buffer E [50 mM 
Tris-HCl (pH 8.0), 0.2 M KCl, 20 mM glutathione] for 60 min. Interacting proteins 
were detected by anti-FLAG immunoblotting or silver staining. 
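
For orientation, the molar quantities in the binding reaction can be converted to 
concentrations. The sketch below is illustrative arithmetic only, assuming the reaction 
volume is 50 microliters as read from the text; the helper name is hypothetical. 

```python
def micromolar(amount_pmol, volume_ul):
    """Concentration in micromolar: 1 pmol per microliter equals 1 micromolar."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 50.0

gst_fusion_um = micromolar(20.0, REACTION_VOLUME_UL)   # GST-fusion bait
pcaf_low_um = micromolar(40.0, REACTION_VOLUME_UL)     # lower bound, P/CAF or E1A
pcaf_high_um = micromolar(60.0, REACTION_VOLUME_UL)    # upper bound, P/CAF or E1A

print(f"GST-fusion: {gst_fusion_um:.1f} uM")
print(f"P/CAF or E1A: {pcaf_low_um:.1f}-{pcaf_high_um:.1f} uM")
print(f"molar excess over bait: {pcaf_low_um / gst_fusion_um:.0f}- to "
      f"{pcaf_high_um / gst_fusion_um:.0f}-fold")
```

On these figures the soluble partner is present in a two- to three-fold molar excess 
over the immobilized GST-fusion. 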

For p300 interactions, the segment spanning residues 1763-1966 (segment B') of 
p300, which is analogous to the CBP segment B, was used. Twenty percent of the 
P/CAF and hGCN5 inputs and 100% of the E1A input were also analyzed. In the GST 
precipitation assays, almost identical amounts of the GST fusions were recovered in all 
samples. Interaction between P/CAF and CBP (segment B) was determined in the 
absence and in the presence of E1A. Control reactions with GST-CBP alone and 
without GST-CBP were also performed. Input proteins were analyzed. 

Two CBP segments, A and B, interacted specifically with P/CAF. The stronger 
interaction was observed in the latter segment, which does not include the yADA2-like 
Zn2+ finger. Given that the CBP segment B is well conserved in p300 (66% identity, 
75% similarity), the binding of P/CAF to p300 in vitro was also analyzed. For this 
experiment, the p300 segment spanning residues 1763-1966, termed segment B', which 
is analogous to the CBP segment B, was used. Like CBP, p300 interacted specifically 
with P/CAF. These studies demonstrated that P/CAF binds specifically to both p300 
and CBP in vitro. In contrast to P/CAF, hGCN5 did not bind to CBP or p300. 

These studies also demonstrated that the Zn2+ finger region of p300/CBP, which 
shares sequence similarity with yADA2, is not essential for the interaction with P/CAF. 
Cloning of a human structural homolog of yADA2, termed hADA2 (25), has revealed 
that, unlike the sequence similarity between p300/CBP and yADA2, which is restricted 
to a 50 amino acid region, hADA2 shares extensive similarity (30% identity, 52% 
similarity) to yADA2 over the entire protein sequence. Moreover, a computer search of 
the complete genomic sequence of Saccharomyces cerevisiae revealed that yeast does 
not have counterparts of p300/CBP or P/CAF. Thus, the p300/CBP-P/CAF pathway 
may have been acquired during metazoan evolution. 



Action of E1A in vitro 

Previous reports indicated that E1A binds to both the p300 segment spanning 
residues 1767-1816 and the CBP segment spanning residues 1805-1854 (7). These 
interactions were reconfirmed in the present system; thus, both p300 and CBP segments 
covering the previously identified regions interacted with E1A. 

For further mapping, a series of deletions was introduced within the CBP 
segment B and tested for interactions with P/CAF and E1A. Deletions of residues 
1801-1825 or 1824-1851 markedly reduced interactions with both P/CAF and E1A, 
whereas deletion of residues 1850-1878 did not affect these interactions. Furthermore, 
deletion of residues 1801-1851 completely abolished interactions with both P/CAF and 
E1A. These data indicate that residues 1801-1851 of CBP are critical for interaction 
with both P/CAF and E1A. Taken together with the evidence that CBP segment A (aa 
residues 1,678-1,880) also binds to these factors, the above findings demonstrate that 
P/CAF and E1A bind to the same or very closely spaced sites on CBP. 
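
The deletion-mapping logic can be restated compactly: the candidate binding region is 
the span covered by every deletion that impaired binding. The sketch below simply 
encodes the four deletions reported above; the data structure, the effect labels and 
the function name are hypothetical conveniences. 

```python
# (first residue, last residue) of each CBP deletion, with its reported effect
# on binding of both P/CAF and E1A.
deletions = [
    ((1801, 1825), "reduced"),
    ((1824, 1851), "reduced"),
    ((1850, 1878), "unaffected"),
    ((1801, 1851), "abolished"),
]


def span_of_impairing_deletions(results):
    """Smallest residue span covering every deletion that impaired binding."""
    impairing = [rng for rng, effect in results if effect != "unaffected"]
    if not impairing:
        return None
    first = min(start for start, _ in impairing)
    last = max(end for _, end in impairing)
    return first, last


print(span_of_impairing_deletions(deletions))
```

The neutral 1850-1878 deletion overlaps this span by only two residues, which is 
consistent with the conclusion that the critical determinants lie within residues 
1801-1851. 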

Evidence that both P/CAF and E1A recognize the same p300/CBP segments 
raises the possibility of direct competition between P/CAF and E1A for binding to 
p300/CBP. To test this possibility, a competition experiment was performed with the 
use of affinity purified recombinant proteins. The interaction of P/CAF with the 
CBP-segment B was progressively inhibited by the addition of increasing amounts of E1A. 
In contrast, no inhibition was caused by an E1A mutant which does not bind to p300/CBP 
(E1AΔN). Similar results were obtained with the p300-segment B', leading to the 
conclusion that P/CAF and E1A compete for the same binding sites in p300/CBP. 

P/CAF-p300/CBP interaction in vivo 

The in vivo interaction between P/CAF and p300/CBP was established by 
co-immunoprecipitation from a human osteosarcoma cell extract. Proteins in this extract 
were immunoprecipitated with rabbit anti-P/CAF, rabbit anti-CBP and anti-p300 
antibodies. For controls, cell extract was precipitated with rabbit control IgG or mouse 
anti-HA monoclonal antibody. The precipitates were analyzed by immunoblotting with 
anti-P/CAF, anti-CBP and anti-p300 antibodies. 

Osteosarcoma cells were transfected with either control vector or E1A- or 
E1AΔN-expression vectors. Extract from the transfected subpopulation was 
immunoprecipitated with anti-P/CAF or control IgG. The precipitates were analyzed by 
immunoblotting with anti-p300 and anti-P/CAF antibodies. 



Rabbit anti-P/CAF antibody was raised to the P/CAF segment spanning residues 
125-397 and purified by immunoaffinity chromatography (33). A mixture of 
monoclonal antibodies raised to the human p300 segment spanning residues 1572-2371 
(5) and rabbit polyclonal antibodies raised to the mouse CBP segment spanning residues 
2-23 (for immunoprecipitation) and 1736-2179 (immunoblotting) were purchased from 
Upstate Biotechnology. Approximately 2 x 10^7 human osteosarcoma U-2 OS cells 
(ATCC accession number HTB 96) were extracted with 10 ml of lysis buffer [25 mM 
HEPES-KOH (pH 7.2), 150 mM potassium acetate, 2 mM EDTA, 1 mM DTT, 1 mM 
AEBSF, 10 µg/ml of aprotinin, 10 µg/ml of leupeptin, 1 µg/ml of pepstatin A, 20 mM 
sodium fluoride, 0.1% NP40]. Two to 10 ml of extract were incubated with 2 µg of the 
respective antibody for four hours at 4°C. Fifty µl (packed volume) of protein-A 
Trisacryl (Pierce) were added and incubation was continued for two hours. The matrix 
was washed four times with 1 ml of the lysis buffer, then boiled in 2x SDS sample 
buffer. Human osteosarcoma U-2 OS cells were transfected with 20 µg of the indicated 
plasmid and 1 µg of sorting plasmid (pCMV-IL2R) (31). The transfected subpopulation 
was purified by magnetic affinity cell sorting (32). Extract from approximately 2 x 10^5 
sorted cells was immunoprecipitated as described. 

Anti-P/CAF antibody specifically detected a 95 kDa protein, which is very close 
to the calculated value for the full-length P/CAF, in the immunoprecipitates. 
Anti-P/CAF antibody co-immunoprecipitated both CBP and p300. Similarly, anti-CBP 
antibody also co-immunoprecipitated P/CAF. However, anti-p300 antibody did not 
co-immunoprecipitate P/CAF. This is most likely due to steric interference since the 
anti-p300 antibody was raised to the p300 segment spanning residues 1572-2371 which 
includes the P/CAF binding region. These data demonstrate that P/CAF forms 
complexes with both p300 and CBP in vivo. 

Action of E1A in vivo 

The in vitro experiments described herein indicate that P/CAF and E1A compete 
for the binding sites in p300/CBP. Thus, a study was conducted to determine whether 
E1A targets the endogenous interaction between P/CAF and p300. An E1A-expression 
vector was transiently transfected into human osteosarcoma cells and the transfected 
subpopulation was purified by cell sorting. Then, the interaction between P/CAF and 
p300 in transfected cells was examined by co-immunoprecipitation with anti-P/CAF 
antibody. The endogenous interaction of P/CAF with p300 was drastically inhibited by 
expression of E1A. On the other hand, no inhibition was observed by the E1A mutant 
lacking the p300 binding domain (E1AΔN), indicating that E1A disrupts the 
P/CAF-p300 complex in vivo through an interaction with p300. 

Cell cycle regulation by P/CAF 

Given that binding of P/CAF to p300/CBP is inhibited by E1A, experiments 
were performed to evaluate whether P/CAF, by binding to and forming a functional 
complex with p300, is involved in the regulation of entry into S phase. This possibility 
was addressed by examining whether transient expression of P/CAF would affect the 
rate of G1/S transit in HeLa cells. P/CAF negatively affected the distribution of cells 
between G1 and S phases in this assay. 



HeLa cells were transfected by electroporation with 7 µg of P/CAF-expression 
plasmid and/or 3 µg of the full-length or the N-terminally deleted (Δ2-36) E1A 
12S-expression plasmid as indicated. These plasmids were constructed by subcloning 
FLAG-P/CAF and E1A cDNAs into pCX (34) and pcDNAI (Invitrogen), respectively. 
All samples, in addition, contained 1 µg of sorting plasmid (pCMV-IL2R) (31) and 
carrier plasmid (pCX) to normalize the total amount of DNA to 11 µg. After 
transfection, cells were incubated in Dulbecco's modified Eagle's medium with 10% fetal 
bovine calf serum for 12 h, and subsequently labeled in medium containing 10 µM 
bromo-deoxyuridine (BrdU) for 30 min. Subsequently, the transfected subpopulation 
was purified by magnetic affinity cell sorting and nuclei were analyzed by dual parameter 
flow cytometry as described (32). 
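
The DNA normalization described above is simple bookkeeping: carrier plasmid makes up 
the difference between the plasmids of interest and the fixed total. The sketch below 
assumes that all plasmid amounts are in micrograms (7 for P/CAF, 3 for E1A, 1 for the 
sorting plasmid, 11 in total); the function and constant names are hypothetical. 

```python
TOTAL_DNA_UG = 11.0        # fixed total per transfection
SORTING_PLASMID_UG = 1.0   # pCMV-IL2R, present in every sample


def carrier_needed_ug(pcaf_ug=0.0, e1a_ug=0.0):
    """Micrograms of carrier plasmid (pCX) needed to reach the fixed total."""
    used = pcaf_ug + e1a_ug + SORTING_PLASMID_UG
    if used > TOTAL_DNA_UG:
        raise ValueError("plasmids of interest exceed the total DNA budget")
    return TOTAL_DNA_UG - used


for label, pcaf, e1a in [("control", 0.0, 0.0), ("P/CAF", 7.0, 0.0),
                         ("E1A", 0.0, 3.0), ("P/CAF + E1A", 7.0, 3.0)]:
    print(f"{label}: {carrier_needed_ug(pcaf, e1a):.0f} ug carrier")
```

Under this assumption the co-transfection of P/CAF and E1A uses the full 11 micrograms 
and needs no carrier. 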

The fraction of cells accumulating in S phase in control cultures was 23%, 
compared to 15% in P/CAF-transfected cells. This effect was reproducible in multiple 
independent experiments. In parallel experiments to verify the utility of this 
experimental protocol, plasmids encoding E2F-1, simian virus 40 small t, cyclin A or 
cyclin E increased the accumulation of cells in S phase, whereas plasmids encoding the 
cyclin-dependent kinase inhibitors p21 or p27 reduced the number of S phase cells. 
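
The size of the P/CAF effect can be expressed either as an absolute change in 
percentage points or as a relative reduction. The following lines are illustrative 
arithmetic on the two reported S-phase fractions only; the variable names are 
hypothetical and no statistical test is implied. 

```python
control_s_phase_pct = 23.0   # S-phase fraction in control cultures
pcaf_s_phase_pct = 15.0      # S-phase fraction in P/CAF-transfected cells

# Absolute change, in percentage points.
absolute_drop_points = control_s_phase_pct - pcaf_s_phase_pct

# Relative change, as a percentage of the control value.
relative_drop_pct = 100.0 * absolute_drop_points / control_s_phase_pct

print(f"absolute drop: {absolute_drop_points:.0f} percentage points")
print(f"relative drop: {relative_drop_pct:.1f}% of control")
```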

On the basis of evidence that E1A and P/CAF compete for binding sites on 
p300, it seemed possible that cotransfection of P/CAF with E1A would oppose the 
mitogenic effect caused by E1A. As shown by the data herein, this is indeed the case. 
E1A alone has mitogenic activity in this experimental setting, while the E1A mutant 
lacking the p300 binding domain (E1AΔN) has very weak activity. Comparable 
expression levels between wild type and mutant E1A in the transfected cells were 
revealed by immunoblotting analysis with anti-E1A. Intriguingly, when P/CAF was 
cotransfected with E1A, the mitogenic activity of E1A was significantly counteracted by 
P/CAF. These results show that P/CAF and E1A mediate antagonistic effects on cell 
cycle progression. 

In the course of assessing P/CAF activity, it was also revealed that p300 is able 
to inhibit cell cycle progression under the same assay conditions. These findings suggest 
that P/CAF and p300, perhaps by forming a complex, act in concert to suppress cell 
cycle progression. 

Histone acetyltransferase activity in P/CAF 

Acetylation of the N-terminal histone tails has been considered to play a crucial 
role in accessibility of transcription factors to nucleosomal templates (26-27). Recently, 
yGCN5 has been identified as a histone acetyltransferase (28). On the basis of this 
information, intrinsic histone acetyltransferase activity in P/CAF and hGCN5 was 
examined. As substrates, the core histones (histones H2A, H2B, H3 and H4) and the 
nucleosome core particles (146 base pairs of DNA wrapped around the octamer of core 
histones) were used. 

Activity of hGCN5 and P/CAF that acetylates free histones or histones in the 
nucleosome core particle (35) was measured as described (36). Each reaction contained 
0.3 pmol of affinity purified FLAG-hGCN5 or FLAG-P/CAF, 4 pmol of the histone 
octamer or the nucleosome core particle and 10 pmol of [1-14C]acetyl-CoA. The 
histone octamer dissociated into dimers or tetramers under assay conditions. Acetylated 
histones were detected by autoradiography after separation by SDS-PAGE. 
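
The quantities in each reaction imply that the enzyme is substoichiometric and that 
acetyl-CoA limits total incorporation. The arithmetic below is an illustrative sketch 
only; the variable names are hypothetical, and the interpretation (one acetyl-CoA 
consumed per acetyl group transferred) is a standard assumption rather than a statement 
from the text. 

```python
enzyme_pmol = 0.3        # FLAG-hGCN5 or FLAG-P/CAF
substrate_pmol = 4.0     # histone octamer or nucleosome core particle
acetyl_coa_pmol = 10.0   # radiolabeled acetyl donor

# Substrate particles available per enzyme molecule.
substrate_per_enzyme = substrate_pmol / enzyme_pmol

# Upper bound on acetyl groups per substrate particle if all donor is consumed.
max_acetyl_per_substrate = acetyl_coa_pmol / substrate_pmol

# Catalytic cycles each enzyme molecule would need to exhaust the donor.
turnovers_to_exhaust_donor = acetyl_coa_pmol / enzyme_pmol

print(f"substrate per enzyme: {substrate_per_enzyme:.1f}")
print(f"max acetyl groups per substrate: {max_acetyl_per_substrate:.1f}")
print(f"turnovers to exhaust donor: {turnovers_to_exhaust_donor:.1f}")
```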

P/CAF and hGCN5 acetylated the core histones with almost the same efficiency. 
Both factors acetylated histones H3 and H4, but preferentially H3. In contrast, very 
weak or no acetylation by hGCN5 was detected in the nucleosome core particles. 
Remarkably, significant acetylation by P/CAF was observed in a nucleosomal context. 
Although all core histones are acetylated in the nucleus, P/CAF and hGCN5 did not 
acetylate histones H2A and H2B in vitro. 

The direct function of P/CAF is likely to involve its intrinsic histone acetyltransferase 
activity. Although the exact molecular mechanisms by which acetylation of core histones 
contributes to transcription remain undefined, acetylation of the histones is considered to 
play an important role in transcriptional regulation (26-27). The positively charged 
N-terminal tails of core histones are believed to affect nucleosome structure by interacting 
with DNA at or near the nucleosome-spacer junction. Acetylation of the histone tails 
presumably destabilizes the nucleosome and facilitates access by regulatory factors. 
Likewise, there is a general correlation between the level of acetylation and 
transcriptional activity of nucleosomal domains. The findings of the present invention 
provide insights into the mechanisms of targeted histone acetylation. 

Cellular factor p300/CBP binds to various sequence-specific factors that are 
involved in cell growth and/or differentiation, including CREB (3,4), c-Jun (9), Fos (11), 
c-Myb (12) and nuclear receptors (13). P/CAF could stimulate the activation function 
of these factors via promoter-specific histone acetylation. The present invention 
demonstrates that E1A appears to perturb normal cellular regulation by disrupting the 
connection between p300/CBP and its associated histone acetyltransferase. 

II. P300/CBP studies. 

Purification of E1A associated histone acetyltransferase. 

FLAG-epitope tagged E1A (or ΔE1A) was expressed in Sf9 cells (ATCC 
accession number CRL 1711) by infection with recombinant baculovirus (43). All 
purification steps were carried out at 4°C. Extract was prepared from infected cells by 
one cycle of freeze and thaw in buffer B (20 mM Tris-HCl, pH 8.0; 5 mM MgCl2; 10% 
glycerol; 1 mM PMSF; 10 mM β-mercaptoethanol; 0.1% Tween 20) containing 0.1 
M KCl and the complete protease inhibitor cocktail (Boehringer Mannheim). To 
prepare E1A-immobilized beads, the extract was incubated with M2 
anti-FLAG antibody agarose (Kodak-IBI) for four hours with rotating and 
subsequently washed with the same buffer three times. The resulting beads were 
incubated with HeLa (ATCC accession number CCL 2) nuclear extract for four to eight 
hours and thereafter washed with the same buffer six times. Finally, FLAG-E1A was 
eluted from the beads along with associated polypeptides by incubating with the same 
buffer containing 0.1 mg/ml FLAG peptide. 

For further purification, eluted polypeptides were dialyzed in 0.05 M KCl-buffer 
B and subsequently loaded onto a SMART Mono Q column (Pharmacia) equilibrated 
with the same 0.05 M KCl-buffer B. After washing, the column was developed with a 
linear gradient of 0.05-1.0 M KCl in buffer B. Mono Q fractions were concentrated with 
a MICROCON spin-filter (Amicon) and subsequently loaded onto a SMART Superdex 
200 column (Pharmacia) equilibrated with 0.1 M KCl-buffer B. 

Histone acetyltransferase assays 

Filter binding assays were performed as described (80) with minor modifications. 

Samples were incubated at 30°C for 10-60 minutes in 30 µl of assay buffer containing 
50 mM Tris-HCl, pH 8.0; 10% glycerol; 1 mM DTT; 1 mM PMSF; 10 mM sodium 
butyrate; 6 pmol of [3H]acetyl-CoA (4.3 mCi/mmole, Amersham Life Science Inc.); and 
33 µg/ml of calf thymus histones (Sigma Chemical Co.). In experiments where synthetic 
peptides were substituted for core histones, 50 pmol of each peptide were used. After 
incubation, the reaction mixture was spotted onto Whatman P-81 phosphocellulose filter 
paper and washed for 30 minutes with 0.2 M sodium carbonate buffer pH 9.2 at room 
temperature with 2-3 changes of the buffer. The dried filters were counted in a liquid 
scintillation counter. 
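
Filter counts from such an assay are commonly converted to picomoles of acetyl groups 
incorporated using the specific activity of the labeled acetyl-CoA. The sketch below 
shows that conversion under stated assumptions: the counting efficiency and the sample 
and background counts are hypothetical, the function name is illustrative, and the 
specific activity is passed in as a parameter using the value printed in the text. 

```python
DPM_PER_MCI = 2.22e9    # 1 mCi = 3.7e7 Bq = 2.22e9 disintegrations per minute
PMOL_PER_MMOL = 1e9


def pmol_acetyl_incorporated(sample_cpm, background_cpm,
                             specific_activity_mci_per_mmol,
                             counting_efficiency):
    """Convert net filter counts to pmol of labeled acetyl groups retained."""
    if not 0 < counting_efficiency <= 1:
        raise ValueError("counting efficiency must be in (0, 1]")
    if specific_activity_mci_per_mmol <= 0:
        raise ValueError("specific activity must be positive")
    # Correct counts per minute to disintegrations per minute.
    net_dpm = (sample_cpm - background_cpm) / counting_efficiency
    # Disintegrations per minute contributed by each pmol of label.
    dpm_per_pmol = specific_activity_mci_per_mmol * DPM_PER_MCI / PMOL_PER_MMOL
    return net_dpm / dpm_per_pmol


# Hypothetical counts and efficiency; specific activity as printed in the text.
incorporated = pmol_acetyl_incorporated(
    sample_cpm=14.546, background_cpm=5.0,
    specific_activity_mci_per_mmol=4.3, counting_efficiency=0.5)
print(f"{incorporated:.2f} pmol acetyl incorporated")
```

Keeping the specific activity as an explicit argument makes it straightforward to 
substitute the value from the supplier's data sheet for a given lot of label. 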



PAGE analysis was done as above except that 90 pmol of [14C]acetyl-CoA (55 
mCi/mmole, Amersham Life Science Inc.) and 9 pmol of core histones or 
mononucleosomes were used. Core histones and mononucleosomes were prepared as 
described (35). For trypsin digestion, reaction mixtures were further incubated with 
various amounts of trypsin on ice for 30 minutes. The samples were analyzed on one 
dimensional SDS-PAGE gels or two dimensional gels, where the first dimension was an 
acid-urea-PAGE gel (44) and the second dimension was an SDS-PAGE gel. 

Protein expression 

For baculovirus expression, cDNAs corresponding to p300 portions of aa 1-670, 
aa 671-1194 and aa 1135-2414 were amplified by PCR (EXPAND High Fidelity PCR 
System; Boehringer Mannheim) as KpnI-NotI fragments. The resulting fragments were 
subcloned into a baculovirus transfer vector having the FLAG-tag sequence (43). The 
recombinant viruses were isolated using the BACULOGOLD system (Pharmingen), 
according to the manufacturer's protocol and were infected into Sf9 cells (ATCC 
accession number CRL 1711) to express FLAG-p300. Recombinant proteins were 
affinity purified with M2 anti-FLAG antibody-immobilized agarose (Kodak-IBI) 
according to the manufacturer's protocol. 



For bacterial expression, cDNAs encoding the p300 portions and the CBP 
portion (aa 1174-1850) were first subcloned into the baculovirus transfer vector having 
the FLAG-tag as described above. Thereafter, the XhoI and NotI fragments encoding 
FLAG-p300 or FLAG-CBP fusions were resubcloned into the E. coli expression vector 
pET-28c (Novagen) digested with SalI and NotI. Recombinant proteins were 
expressed in E. coli BL21(DE3) and affinity purified with M2-antibody agarose. 

Histone acetyltransferases that associate with E1A 

Although the adenovirus E1A 12S protein (E1A) inhibits transcription in a 
variety of genes via direct binding to p300/CBP (45), E1A also stimulates transcription 
in some contexts (46). Thus, p300/CBP-bound E1A was tested to determine whether it 
might recruit histone acetyltransferases or deacetylases to regulate transcription. In 
addition, experiments were conducted as described below to determine if p300/CBP per 
se is a histone acetyltransferase. 

Initially, recombinant FLAG-epitope tagged E1A was immobilized on 
anti-FLAG antibody beads. Immobilized E1A was incubated with a HeLa nuclear 
extract for affinity purification of E1A-associated polypeptides. FLAG-E1A 
was then eluted from the beads, along with E1A-associated polypeptides, by 
incubating with FLAG-peptide. Although E1A per se has no histone acetyltransferase 
activity, E1A recruited significant amounts of histone acetyltransferase activity from the 
nuclear extract. It is very unlikely that this activity is derived from P/CAF given that 
E1A and P/CAF cannot bind to p300/CBP simultaneously (43). Consistent with this, no 
P/CAF was detected in these fractions by immunoblotting. 
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The El A N-terminus, a region that is not highly conserved among the various 
adenovirus serotypes, is involved in p300/CBP binding in vivo. Mutations in the 
N-terminal region lead to loss of the ability for p300/CBP binding without affecting RB 
5 binding (1,47). Thus, the requirement of the El A N-terminal region for the recruitment 
of histone acetyltransferase activity was tested. In contrast to the wild type, the 
N-terminal deleted form of El A (AN-E1 A) recruited only a background level of 
acetyltransferase activity. In agreement with previous reports (47), the AN-E1 A 
showed no ability to interact with p300/CBP, although it still retained the ability to 
10 interact with a variety of other polypeptides, including RB. 

To define the relationship between p300/CBP and histone acetyltransferase
activity, affinity purified E1A-binding polypeptides were separated on a Mono Q
ion-exchange column. Both p300/CBP and the acetyltransferase activity coeluted
at 140 mM KCl, while most of the polypeptides eluted at 260 mM KCl. The active
fraction from the Mono Q column (~140 mM KCl) was further separated on a Superdex-200 gel
filtration column. Both p300/CBP and the acetyltransferase activity coeluted after the
void volume, indicating that p300/CBP is involved in the histone acetyltransferase
activity.

p300 is a histone acetyltransferase 

The data provided herein indicate that p300 per se, or a polypeptide(s) 
associated with p300, possesses histone acetyltransferase activity. To test the former 
possibility, the acetyltransferase activity of recombinant p300 was measured. p300 was 
divided into three fragments, each of which was expressed in and purified from Sf9 cells
via a baculovirus expression vector. Histone acetyltransferase activity was readily
detected in the C-terminal fragment containing amino acids 1135-2414, whereas no
activity was found in the other fragments, demonstrating conclusively that p300 per se is 
a histone acetyltransferase. 

p300/CBP-histone acetyltransferase domain

To map the histone acetyltransferase domain of p300, a series of deletions
was prepared. Given the poor conservation of the glutamine-rich region (aa 1815-2414)
in the C. elegans p300/CBP homolog (6), the p300 fragment encoding aa 1135-1810
was expressed in and purified from E. coli. Importantly, this candidate region of p300
(aa 1135-1810) showed significant histone acetyltransferase activity. For further
mapping within this region, a series of N-terminal deletions was constructed. Deletion
of 60 residues, resulting in a fragment containing aa 1195-1810, had no effect on the
acetyltransferase activity, whereas the deletion of 185 residues, yielding a fragment
comprising aa residues 1320-1810, completely eliminated the acetyltransferase activity.

Next, a series of C-terminal deletions was analyzed to determine the requirement
of the P/CAF (or E1A)-binding domain. The p300 fragments lacking the E1A-binding
domain (aa 1195-1760, 1195-1706 and 1195-1673) still retained the acetyltransferase
activity, whereas the further truncated mutant (aa 1195-1652) completely lost the
acetyltransferase activity. Consistent with these results, a mutant with an internal
deletion of residues 1418-1720 showed no acetyltransferase activity. These data
demonstrate that the histone acetyltransferase domain is located between the
bromodomain and the E1A-binding domain. Given that the histone acetyltransferase
domain is highly conserved between p300 and CBP (91% similarity), the corresponding
region of CBP, aa residues 1174-1850, was expressed to confirm the acetyltransferase
activity. As expected, comparable activity was detected, indicating that both p300 and
CBP are histone acetyltransferases.
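
The deletion mapping described above amounts to simple interval bookkeeping. Purely as an illustration, the following Python sketch records the p300 fragments recited above together with whether each retained acetyltransferase activity, and derives the smallest active fragment and the windows within which the N-terminal and C-terminal boundaries of the domain must lie. The data structure and function names are hypothetical and form no part of the methods described herein.

```python
# Hypothetical bookkeeping for the p300 deletion series described above.
# Each entry: (first residue, last residue, acetyltransferase activity detected?)
FRAGMENTS = [
    (1135, 1810, True),
    (1195, 1810, True),
    (1320, 1810, False),
    (1195, 1760, True),
    (1195, 1706, True),
    (1195, 1673, True),
    (1195, 1652, False),
]


def smallest_active(fragments):
    """Return (start, end) of the shortest fragment that retained activity."""
    active = [(s, e) for s, e, ok in fragments if ok]
    return min(active, key=lambda f: f[1] - f[0])


def boundary_windows(fragments):
    """Bracket the domain boundaries relative to the smallest active fragment."""
    start, end = smallest_active(fragments)
    # Inactive fragments that begin later but still reach the same end or beyond:
    # these lost essential N-terminal residues.
    n_lost = [s for s, e, ok in fragments if not ok and s > start and e >= end]
    # Inactive fragments that begin no later but stop short of the end:
    # these lost essential C-terminal residues.
    c_lost = [e for s, e, ok in fragments if not ok and s <= start and e < end]
    n_window = (start, min(n_lost)) if n_lost else None
    c_window = (max(c_lost), end) if c_lost else None
    return n_window, c_window


if __name__ == "__main__":
    print("smallest active fragment:", smallest_active(FRAGMENTS))
    print("boundary windows (N, C):", boundary_windows(FRAGMENTS))
```

With the fragments listed above, the smallest active fragment is aa 1195-1673, the N-terminal boundary falls between residues 1195 and 1320, and the C-terminal boundary falls between residues 1652 and 1673.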

Among various acetyltransferases, including the histone acetyltransferases GCN5 and
P/CAF, putative acetyl-CoA binding sites are conserved (48). However, multiple
alignment analysis (49) showed that the p300/CBP histone acetyltransferase domain
does not belong to this group. Moreover, comparison of the p300/CBP histone
acetyltransferase domain with peptide sequence databases (23) showed no sequence
similarity to any other proteins. Accordingly, this invention shows that p300/CBP
represents a novel class of acetyltransferases in that it does not have the conserved motif
found among previously described acetyltransferases (48).

p300 acetylates all core histones in mononucleosomes

Substrate specificity for acetylation by p300 was also examined. As substrates, 
histone octamers and mononucleosomes (146 base pairs of DNA wrapped around the
octamer of core histones) were used. Given that the histone octamer dissociates into
dimers or tetramers under physiological conditions, the histone octamer is referred to
here as core histones. When core histones were used, p300 acetylated all four proteins, 
but preferentially H3 and H4. More importantly, in a nucleosomal context, p300 
acetylated all four core histones nearly stoichiometrically. In contrast, p300 acetylated 
neither BSA nor lysozyme. 

Hyperacetylated histones are believed to be linked with transcriptionally active 
chromatin (26,27,50,51). Hyperacetylated forms are found in histones H4, H3 and H2B, 
which have multiple acetylation sites in vivo. Thus, the level of acetylation by p300 was 
also tested. 

Mononucleosomes treated with p300 were analyzed by two-dimensional gel 
electrophoresis. A Coomassie blue-stained gel and the corresponding autoradiogram 
showed that a significant amount of histones, especially H4, were hyperacetylated. 
Importantly, acetylation levels by p300 were very close to those of hyperacetylated 
histones prepared from HeLa nuclei treated with sodium butyrate, a histone deacetylase
inhibitor. In contrast, no acetylated forms were detected in the reaction 
without p300. These results indicate that p300 acetylates histones in mononucleosomes 
to the hyperacetylated state by targeting multiple lysine residues. 

p300 acetylates the four lysines in the histone H4 N-terminal tail in vitro which are 
acetylated in vivo 

Lysines at positions 5, 8, 12 and 16 of histone H4 are acetylated in vivo
(51). Recent studies with yeast histone acetyltransferases demonstrate
position-specific acetylation by distinct acetyltransferases, i.e., while cytoplasmic
acetyltransferases for histone deposition and chromatin assembly modify
positions 5 and 12, GCN5 modifies positions 8 and 16 (52). Accordingly, the positions
of acetylation by p300 were also determined. A series of synthetic peptides containing
acetylated lysines at various positions was used to determine the acetylation
site-specificity of p300. Consistent with the two-dimensional gel electrophoresis
analysis, the experiments with peptide substrates showed that p300 acetylates all four
lysines in histone H4 that are acetylated in vivo. These results are consistent with the
view that deposition-related diacetylated histones are deacetylated during maturation
of chromatin (53).

p300 preferentially acetylates the N-terminal histone tail

Histone acetyltransferases modify specific lysine residues in the N-terminal
tail of core histones but not the C-terminal globular domain in vivo (26,27,50,51).
Structural models of nucleosomes (54,55,56) suggest that most of the lysine residues in
the C-terminal globular domain are buried. Therefore, experiments were conducted to
examine whether restricted acetylation of the N-terminal tail resulted from the substrate
specificity of the enzyme or inaccessibility of the core domain to the enzyme in
nucleosomes. The globular domains of all core histones contain a long helix flanked on
either side by a loop segment and short helix, termed the "histone fold" (54,55,56).
The histone fold is involved in formation of the stable H2A-H2B and H3-H4
heterodimers, consisting of extensive hydrophobic contacts between the paired
molecules. Therefore, it is likely that a histone monomer cannot fold properly, thereby
increasing access of the histone acetyltransferase to the core domain. Based on these
considerations, experiments were conducted to determine whether p300 acetylates free
histone H4 in an N-terminal-specific manner.

Histone H4 was acetylated with p300 and subsequently the histone tail was 
removed by partial digestion with trypsin. The distributions of radioactivity between 
intact and core histones were compared. While the globular core histone domain was 
predominant at the higher trypsin concentrations, radioactivity was detected mostly in 
the intact histone. These data demonstrate that p300 preferentially acetylates the
N-terminal tail of histone H4.

III. P/CAF interaction with MyoD

Tissue culture and transfection experiments 

C2C12 mouse cells (ATCC accession number CRL 1772) were grown in
Dulbecco's modified Eagle medium (DMEM) supplemented with 20% fetal bovine
serum (FBS) until they reached confluence. Differentiation was induced by switching
the medium to differentiation medium (DM), consisting of DMEM containing 2% horse
serum. C3H10T1/2 fibroblasts (ATCC accession number CCL 226) were grown in
DMEM supplemented with 10% FBS. Cells were transfected by the calcium phosphate
precipitation method. Total amounts of transfected DNA were equalized with empty
vector DNA. After 12 h incubation in medium containing the precipitated DNA, the
cells were washed and incubated in fresh DMEM containing 10% FBS for an additional
24 h. Afterwards, differentiation was induced by incubating in DM for 36 to 72 h.
Chloramphenicol acetyltransferase (CAT) assays were performed as previously
described (64,69). The quantities of cell extracts used for CAT assays were normalized
to β-galactosidase activity by cotransfection of 1 µg of the β-galactosidase expression
vector, pON260.

Expression vectors used for transfection experiments are as follows:
pCX-P/CAF for P/CAF (43); pCMV-bp300 for p300 (65), pCMV-p300 (1869-2414)
(64) and pCMV-p300 (1514-1922) (60) for p300 wild type and mutants; pE1A12S,
pE1A12S R2G, pE1A12S Δ2-36 and pE1A12S Δ121-130 for E1A wild type and
mutants (66,67,68); and pEMSV-MyoD for MyoD (64).



The antisense P/CAF RNA expression vector, pcDNA3 P/CAF-AS, was created
as follows. The 2.5 kb EcoRI-KpnI fragment containing the entire P/CAF open reading
frame was isolated from pCX-P/CAF (43). This fragment was subcloned into the
EcoRI-KpnI sites of plasmid pcDNA3 (Invitrogen) so that expression of the antisense
P/CAF RNA is driven by the CMV promoter. Reporter genes employed were 4RE-CAT and
MCK-CAT (69). 4RE-CAT is driven by a synthetic promoter containing 4 copies of the
E-box, whereas MCK-CAT is driven by the native MCK promoter (nucleotides -1256 to
+7).

Microinjection and immunofluorescence 

Cells were grown on small glass slides, subdivided into numbered squares of 2
mm x 2 mm and microinjected with purified and concentrated antibodies, as previously
described (70). For immunofluorescence, cells were fixed in either 2%
paraformaldehyde or 1:2 methanol/acetone solution, preincubated with 5% BSA/PBS
and incubated with the primary antibodies for 30 min at 37° C. Subsequently, antibody
was visualized by incubating with either rhodamine- or fluorescein-conjugated secondary
antibody for 30 min at 37° C. Injected antibodies were stained with a
rhodamine-conjugated secondary antibody and nuclei were counter-stained with DAPI as
previously described (69).

Antibodies employed are as follows: rabbit polyclonal affinity-purified
anti-P/CAF antibody (43), rabbit polyclonal anti-p300/CBP antiserum (71), mouse
monoclonal anti-MyoD antibody (clone 5.8A, kindly provided by P. Houghton), goat
polyclonal anti-c-Jun affinity-purified antibody (Santa Cruz) and rabbit pre-immune
serum.

Immunoprecipitation and DNA affinity purification 

Cells were resuspended in lysis buffer (20 mM NaPO4, 150 mM NaCl, 5 mM
MgCl2, 0.1% NP40, 1 mM DTT, 10 mM sodium fluoride, 0.1 mM sodium vanadate, 1
mM phenylmethylsulfonyl fluoride and 10 µg/ml each of leupeptin, aprotinin and
pepstatin). After 30 min incubation on ice, samples were centrifuged at 12,000 x g for
30 min and supernatants were used as cell extracts. Extracts were pre-cleared by
incubating with rabbit pre-immune serum and protein A/G Plus-Agarose (Santa Cruz)
for 2 h at 4° C. For immunoprecipitation, the supernatants were incubated with the
respective antibodies for 3 h at 4° C. Protein A/G Plus-Agarose was added, and
incubation continued for 3 h. The matrix was washed with lysis buffer, then boiled in 2
X SDS sample buffer. Immunoblotting was performed using the ECL
chemiluminescent detection kit (Amersham) according to the manufacturer's protocol.
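
As an aid to preparing the lysis buffer recited above, the following Python sketch computes the volume of each stock solution needed for a chosen final volume using the dilution relation C1V1 = C2V2. The final concentrations are those recited above for the molar components. The stock concentrations, the 10 ml batch size and all names are assumptions made for illustration only, and the detergent and protease inhibitors are omitted.

```python
# Final concentrations (mM) of the molar components of the lysis buffer recited above.
FINAL_MM = {
    "sodium phosphate": 20.0,
    "NaCl": 150.0,
    "MgCl2": 5.0,
    "DTT": 1.0,
    "sodium fluoride": 10.0,
    "sodium vanadate": 0.1,
    "PMSF": 1.0,
}

# Hypothetical stock concentrations (mM); assumed for illustration only.
STOCK_MM = {
    "sodium phosphate": 500.0,
    "NaCl": 5000.0,
    "MgCl2": 1000.0,
    "DTT": 1000.0,
    "sodium fluoride": 500.0,
    "sodium vanadate": 100.0,
    "PMSF": 100.0,
}


def stock_volumes_ul(final_mm, stock_mm, total_ml):
    """Volume of each stock in microliters, from C1 * V1 = C2 * V2."""
    total_ul = total_ml * 1000.0
    return {name: total_ul * conc / stock_mm[name] for name, conc in final_mm.items()}


def water_volume_ul(volumes_ul, total_ml):
    """Volume of water (microliters) needed to reach the final volume."""
    return total_ml * 1000.0 - sum(volumes_ul.values())


if __name__ == "__main__":
    batch_ml = 10.0  # assumed batch size
    volumes = stock_volumes_ul(FINAL_MM, STOCK_MM, batch_ml)
    for name, volume in volumes.items():
        print(f"{name}: {volume:.1f} ul")
    print(f"water to volume: {water_volume_ul(volumes, batch_ml):.1f} ul")
```

Keeping the final and stock concentrations in separate tables lets the assumed stock values be replaced by those actually on hand without touching the recited buffer composition.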

Affinity purification of E-box-bound complexes was done as previously
described (69). Briefly, 100 ng of biotinylated double-stranded DNA containing the
E-box were immobilized on streptavidin-conjugated magnetic beads and incubated with
500 µg of cell extracts in the presence of poly dI-dC. After extensive washing, bound
proteins were eluted with SDS sample buffer and analyzed by immunoblotting.

In vitro protein-protein interaction assays 

The CBP-B fragment and its deletion derivatives were expressed as
GST-fusions as described previously (43). MyoD and E1A (43) were expressed as
FLAG-fusion proteins in Sf9 cells via a baculovirus expression system and
affinity-purified on M2 anti-FLAG antibody-agarose (Kodak-IBI). Crude E. coli
extracts containing GST-fusions were incubated with various amounts of MyoD and/or
E1A in 50 µl of buffer B (20 mM Tris-HCl, pH 8.0, 0.1 M KCl, 5 mM MgCl2, 10%
glycerol, and 0.1% Nonidet P-40) on ice for 10 min. GST-precipitation was performed
as described (43). MyoD and E1A were detected by immunoblotting with anti-FLAG
M2 antibody. For the interaction between P/CAF and MyoD, 1.5 pmol of
FLAG-P/CAF and 15 pmol of FLAG-MyoD were incubated in 50 µl of buffer B on ice
for 10 min. The mixture was further incubated with 2 µg of anti-P/CAF (43) or
anti-hADA2 antibody for 60 min. The immunocomplexes were precipitated by
incubation with 10 µl of protein A-Trisacryl (Pierce) and rotated for 1-4 h at 4° C. The
matrix was washed 4 times with 200 µl of buffer B and boiled in 10 µl of 2 X SDS
sample buffer. The proteins were resolved on a 4%-20% gradient SDS-PAGE gel and
subjected to immunoblotting with the anti-FLAG M2 antibody. The blot was developed
with the SUPERSIGNAL chemiluminescent substrates (Pierce).

P/CAF coactivates muscle-specific transcription

P/CAF and MyoD were co-transfected into mouse C3H10T1/2 fibroblasts, and
MyoD-mediated transcription was determined from reporter activity driven by the
artificial (4RE) and the naturally-occurring muscle creatine kinase (MCK) promoters.
Overexpression of P/CAF stimulated MyoD-dependent transcription several-fold from
both promoters. Similar results were obtained for the MyoD-activated myogenin
promoter. Transcriptional activation was further stimulated by co-transfecting with
MyoD, P/CAF and p300 expression vectors, suggesting that P/CAF may function by
forming a complex with p300/CBP. Consistent with the lack of DNA binding capacity in
P/CAF, overexpression of P/CAF alone did not increase the basal transcriptional activity
of either enhancer. To test whether P/CAF and p300/CBP function in the same pathway,
two dominant negative forms of p300 were employed which specifically inhibit
p300/CBP-mediated transcription (60,64). The p300 segment spanning residues
1514-1922 inhibits MyoD-dependent activation via direct interaction with MyoD
(60), whereas the p300 segment spanning residues 1869-2414 inhibits it without direct
interaction (64). Both dominant negative mutants inhibited MyoD coactivation by
P/CAF, suggesting that P/CAF and p300/CBP function in the same pathway.

For further elucidation of the activation mechanism by P/CAF, the effect of E1A,
which inhibits MyoD-dependent transcription and differentiation (66,72,73) via direct
interaction with p300/CBP (65,78), was tested. Expression of E1A in C3H10T1/2
fibroblasts inhibited stimulation of MyoD-directed transcription by P/CAF
overexpression. E1A mutants lacking p300/CBP-binding activity, E1A Δ2-36 and E1A
R2G (67,79), had almost no effect. On the other hand, an E1A mutant retaining
p300/CBP-binding activity, E1A Δ121-130, behaved like the wild type. Since E1A
associates with p300/CBP, but not with P/CAF, these results suggest that P/CAF
functions in MyoD-directed transcription via interaction with p300/CBP.

To address the role of P/CAF as a myogenic coactivator in a more relevant
environment, P/CAF was overexpressed in proliferating C2C12 myoblasts which express
endogenous myogenic bHLH factors. As observed in fibroblasts, overexpression of
P/CAF stimulated muscle-specific transcription. Concomitant expression of exogenous
p300 increased P/CAF-mediated coactivation. The repression exerted by wild type E1A,
but not mutant E1A Δ2-36, on P/CAF coactivation of MyoD was also observed in
muscle cells.

Similar experiments were performed with myogenic cell lines that were stably
transformed with wild type or mutant E1A-expressing vectors (66). Coactivation by
P/CAF was inhibited by wild type E1A or the E1A mutant that retains
p300/CBP-binding activity (E1A Δ121-130). In contrast, E1A mutants that lack
p300/CBP-binding (E1A Δ2-36 and E1A R2G) allowed transcriptional coactivation by
P/CAF. Taken together, these experiments show that P/CAF coactivates MyoD-directed
transcription via interaction with p300/CBP.
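
The reasoning behind this conclusion can be laid out as a small truth table. The following Python sketch, offered only as an illustration, encodes each E1A construct described above by whether it retains p300/CBP binding and whether it inhibited P/CAF coactivation, and checks that inhibition tracks p300/CBP binding. The encoding and names are hypothetical, and "almost no effect" is treated here as no inhibition.

```python
# Hypothetical encoding of the E1A constructs described above.
# binds_p300_cbp: the construct retains p300/CBP-binding activity
# inhibits: the construct inhibited P/CAF coactivation of MyoD-directed transcription
E1A_CONSTRUCTS = {
    "wild type": {"binds_p300_cbp": True, "inhibits": True},
    "delta 121-130": {"binds_p300_cbp": True, "inhibits": True},
    "delta 2-36": {"binds_p300_cbp": False, "inhibits": False},
    "R2G": {"binds_p300_cbp": False, "inhibits": False},
}


def inhibition_tracks_binding(constructs):
    """True if every construct inhibits exactly when it binds p300/CBP."""
    return all(c["binds_p300_cbp"] == c["inhibits"] for c in constructs.values())


if __name__ == "__main__":
    for name, props in E1A_CONSTRUCTS.items():
        print(name, props)
    print("inhibition tracks p300/CBP binding:",
          inhibition_tracks_binding(E1A_CONSTRUCTS))
```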

P/CAF stimulates myogenic differentiation

Given that P/CAF potentiates MyoD-directed transcription, the ability of P/CAF 
to assist MyoD in promoting myogenic differentiation was investigated. To this aim, 
C3H10T1/2 fibroblasts were transiently transfected with P/CAF and MyoD expression 
vectors. An expression vector for the green fluorescent protein (GFP) was
co-transfected to identify transfected cells. After incubation in differentiation medium,
the myogenic conversion of transfected cells was determined by simultaneous expression
of GFP and the differentiation-specific marker myosin heavy chain (MHC). Forced
expression of MyoD in fibroblasts caused muscle differentiation in 12% of the
transfected fibroblasts. This myogenic conversion rose to 20% upon co-expression of
MyoD and P/CAF. As observed in transcription experiments, stimulation of differentiation by
P/CAF was counteracted by co-transfection with the p300 dominant negative mutant, 
p300 (1869-2414). Consistent with a general role for coactivators, overexpression of 
P/CAF alone was unable to differentiate fibroblasts. 

Similar experiments were done using proliferating C2C12 myoblasts in which the
differentiation program is already committed. Most of the myoblasts differentiated into
myotubes upon overexpression of P/CAF, whereas only a modest effect was observed upon
overexpression of p300. In contrast, differentiation was inhibited slightly by
overexpression of c-Jun. This inhibitory effect presumably was caused by titration of
p300/CBP, which associates directly with c-Jun (74). A similar inhibition was observed
with the p300 dominant negative mutant. Consistent with the transcriptional effect, E1A
almost completely inhibited differentiation. The E1A mutant R2G, lacking p300/CBP-binding
capability but retaining the retinoblastoma protein (Rb)-binding capability, only partially
inhibited differentiation, although this same mutant
inhibited transcription as severely as the wild type. Taken together, these data show that
P/CAF stimulates muscle differentiation by coactivating MyoD function via p300/CBP
association.

P/CAF is essential for myogenic transcription and differentiation 

To test the necessity of P/CAF for myogenic transcription, experiments were
conducted whereby P/CAF synthesis was inhibited by expressing antisense P/CAF RNA.
A vector from which the P/CAF mRNA is transcribed in the antisense orientation
(P/CAF-AS) was transfected with P/CAF and MyoD expression vectors into fibroblasts
and MyoD-dependent transcription was examined. Cotransfection of the antisense
expression vector strongly inhibited MyoD-dependent transcription below the level of
induction elicited by MyoD alone, demonstrating that expression of P/CAF antisense
RNA inhibits not only the coactivation exerted by exogenous P/CAF but also that of
endogenous P/CAF. These results indicate that P/CAF is essential for MyoD-dependent
transcription.

Studies were also carried out to determine whether expression of P/CAF
antisense RNA inhibits myogenic differentiation. C3H10T1/2 fibroblasts were transiently
transfected with various expression vectors with or without the P/CAF antisense RNA
expression vector. Expression of P/CAF antisense RNA reduced MyoD-mediated
myogenic conversion of fibroblasts. Expression of P/CAF antisense RNA also
counteracted the stimulatory effect of both P/CAF and p300 on myogenic
differentiation. These data support the view that P/CAF and p300/CBP coactivate
MyoD-dependent transcription in the same pathway. More drastic inhibition was
observed in C2C12 myoblasts in similar experiments. Therefore, it can be concluded that
P/CAF is essential for transcription of muscle-specific genes and hence differentiation
into myotubes.

To further confirm the essential role of P/CAF for myogenic differentiation,
blocking experiments by antibody microinjection were performed. Antibodies were
injected into the cytoplasm of proliferating C2C12 myoblasts to prevent the nuclear
transport of newly synthesized target proteins. After incubating in the differentiation
medium, the degree of differentiation was determined. Microinjection of an anti-P/CAF
antibody almost completely inhibited differentiation. Similar results were obtained by
microinjecting anti-p300/CBP antibodies. Although microinjection of either
anti-p300/CBP or anti-P/CAF antibody was sufficient to inhibit differentiation, an even
greater inhibition was observed by coinjecting both of them. Microinjection of
anti-P/CAF or anti-p300/CBP antibody did not interfere with induction of p53 by DNA
damaging agents, showing specificity of the inhibition by the antibodies. In contrast to
anti-P/CAF or anti-p300/CBP antibodies, the injection of anti-MyoD antibody only
partially inhibited differentiation, supporting the view of functional redundancy between
MyoD and Myf-5 (75,76). Injection of anti-c-Jun antibody or control antibody did not
interfere with muscle differentiation.

Similar experiments were performed with C3H10T1/2 fibroblasts stably
expressing MyoD. In these cells, either anti-p300/CBP or anti-P/CAF antibody
completely inhibited muscle differentiation. In contrast to myoblasts, anti-MyoD
antibody completely blocked differentiation in the fibroblasts expressing MyoD.
Anti-c-Jun and control antibodies did not interfere with differentiation. Taken together,
these results demonstrate that P/CAF and p300/CBP are indispensable for activation of
the myogenic program.

p300/CBP, P/CAF and MyoD form a multimeric complex in vivo

The data described above indicate that P/CAF stimulates MyoD-directed 
transcription via association with p300/CBP. Thus, experiments were conducted to 
investigate whether P/CAF, p300/CBP and MyoD could associate in a complex. 
First, cellular extracts derived from C2C12 myotubes were subjected to
immunoprecipitation. Both anti-MyoD and anti-p300/CBP antibodies co-precipitated
P/CAF. In a complementary experiment, both anti-p300/CBP and anti-P/CAF 
antibodies also co-precipitated MyoD, suggesting that these factors form a multimeric 
protein complex in myotubes. 

Next, attempts were made to detect this complex on the E-box, the DNA 
binding site for MyoD. Immobilized DNA containing an E-box sequence was incubated 
with myotube extracts. After extensive washing, P/CAF, p300/CBP and MyoD were 
analyzed by immunoblotting. P/CAF, p300/CBP and MyoD were all affinity purified on 
the immobilized DNA, whereas they were not purified on the control DNA lacking the
E-box. Given that P/CAF and p300/CBP per se cannot bind to DNA, these observations 
indicate that P/CAF and p300/CBP are recruited through MyoD at the E-box sites to 
form a multi-protein complex. 

Complex formation is inhibited by viral transforming factors

Since the oncoviral proteins E1A and large T antigen inhibit myogenic
transcription and differentiation, the effect of these factors on the formation of
complexes on the E-box was tested. Importantly, very small amounts of P/CAF and
p300/CBP were co-purified on the E-box from extracts of myocytes which stably express
E1A or large T antigen, although MyoD was detected under these conditions. The lower
recovery of MyoD from E1A-expressing muscle cells could reflect the low level of
MyoD in the extracts (66). These results indicate that E1A and large T antigen
dissociate P/CAF and p300/CBP from MyoD without altering MyoD binding to DNA.

Consistent with the previous observations that transiently expressed E1A
prevents interaction between P/CAF and p300/CBP in vivo (43), the association
between p300/CBP and P/CAF was abolished in myoblasts stably transformed by wild
type E1A but not in those clones transformed with the E1A mutant R2G, which is unable
to bind p300/CBP. Similarly, the interaction between p300/CBP and P/CAF was abolished by
large T antigen but not by the mutant protein that localizes to the cytoplasm (77).

Interaction between MyoD, P/CAF and CBP in vitro 

Previous interaction experiments in vitro indicate that the CBP region spanning
residues 1801 to 1850 is crucial for interaction with both P/CAF and E1A (43). While
most sequence-specific factors bind to CBP sites distinct from the P/CAF/E1A binding
sites, MyoD interacts with an overlapping CBP fragment called the CH3 region
(60,64,65). To understand how P/CAF, p300/CBP and MyoD associate, the CBP sites
important for MyoD binding were mapped more precisely. Consistent with previous
reports (60,64,65), the CBP fragment spanning residues 1801-2000 (fragment B) bound
MyoD. Moreover, deletion of residues 1801 to 1850 within fragment B completely
abolished interaction with MyoD, which is similar to the results obtained with P/CAF
and E1A. Importantly, an internal deletion of residues 1850-1878 abolished the MyoD
interaction with CBP, while it did not affect binding of E1A or P/CAF (43). These
results suggest that MyoD and P/CAF bind to distinct sites of p300/CBP, although the
binding sites may overlap. Moreover, a direct interaction was observed between MyoD
and P/CAF, which may contribute to stabilization of the multimeric complex.

These data show that E1A prevents not only p300/CBP-interaction with
P/CAF but also that with MyoD in vivo. To obtain evidence that this
inhibition is due to the direct action of E1A, competition experiments were performed
in vitro. Importantly, the interaction between CBP and MyoD was strongly inhibited by
addition of E1A, implying that E1A inhibits myogenic transcription by disrupting
multiple interactions.

Although the present process has been described with reference to specific
details of certain embodiments thereof, it is not intended that such details should be
regarded as limitations upon the scope of the invention except as and to the extent that
they are included in the accompanying claims.

Throughout this application various publications are referenced by numbers
within parentheses. Full citations for these publications are as follows. The disclosures
of these publications in their entireties are hereby incorporated by reference into this
application in order to more fully describe the state of the art to which this invention
pertains.

SEQUENCE LISTING

(1) GENERAL INFORMATION 

(i) APPLICANT: The United States of America, as represented by the
Secretary, Department of Health and Human Services, c/o 
National Institutes of Health, Office of Technology Transfer, 
6011 Executive Boulevard, Suite 325, Rockville, Maryland 20842 

(ii) TITLE OF THE INVENTION: METHODS AND COMPOSITIONS FOR 

p300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF 

(iii) NUMBER OF SEQUENCES : 18 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: NEEDLE & ROSENBERG, P.C. 

(B) STREET: Suite 1200, 127 Peachtree Street, NE 

(C) CITY: Atlanta 

(D) STATE: GA 

(E) COUNTRY: USA 

(F) ZIP : 30303 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 23-JUL-1997 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: Corresponding U.S. Serial No. 60/022,273 

(B) FILING DATE: 23-July-1996 



(viii) ATTORNEY/ AGENT INFORMATION: 

(A) NAME: Miller, Mary L 

(B) REGISTRATION NUMBER: 39,303 

(C) REFERENCE/DOCKET NUMBER: 14014.0238/P

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 404/688-0770 

(B) TELEFAX: 404/688-9880 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 832 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:



Met 


S e it 


Glu 


Ala 


Glv 
j 


Glv 


Ala 


Glv 


i 








5 








Gly 


Ala 


Glv 


Ala 


Glv 


Pro 


Glv 


Ala 








20 












civ 


Ala 


Pro 


Pro 


Gin 


Gly 


Ser 






35 










40 


Ser 


Gly 


Ala 


Cys 


Glv 


Pro 


Ala 


Thr 




50 










55 




Glu 


Gly 


Pro 


Glv 


Glv 


Glv 


Glv 


Ser 


65 










70 






Gin 


Leu 


Arg 


Ser 


Ala 


Pro 


Ara 


Ala 










85 








Tvr 
y 


Ser 


Ala 


Cvs 


LVS 


Ala 


Glu 


Glu 








100 










As n 


Pro 


Asn 


Pro 


Ser 


Pro 


Thr 


Pro 






115 










120 


lie 


Val 


Ser 


Leu 


Thr 


Glu 


Ser 


Cvs 




130 










135 




Ala 


His 


Val 


Ser 


His 


Leu 


Glu 


Asn 


14 5 










150 






Leu 


Leu 


Gl y 


He 


Val 


Leu 


Asp 


Val 










165 








Lys 


Glu 


Glu 


Asp 


Ala 


Asp 


Thr 


Lys 








18 0 










Leu 


Leu 


Arg 


Lys 


Ser 


lie 


Leu 


Gin 






195 










2 00 


S e it 


Leu 


Glu 


Ly s 


Lys 


Pro 


Pro 


Phe 




J. VJ 










215 




v ai. 




As n 


Phe 


Val 


Gin 




Lys 


Z. <£. «J 










23 0 






y 


Gin 


Th r 


He 


Val 


Glu 


Leu 


Ala 








2 4 5 








1 y L 




His 


Leu 


Glu 


Ala 


Pro 


Ser 








2 60 












Asp 


He 


Ser 


Gly 


Tv r 


Lys 


Glu 






27 5 










280 


Cy s 


As n 


Val 


Pro 


Gin 


Phe 


Cys 


Asp 




Z? \J 










2 95 






v a j. 


Phe 


Gly 


Arg 


Thr 


Leu 


Leu 


305 










310 






.r\.±- y 


Gin 


Leu 


Leu 


Glu 


Gin 


Ala 


Arg 










325 








Glu 


Ly s 


Arg 


Thr 


Leu 


He 


Leu 


Thr 








340 










Leu 


Glu 


Glu 


Glu 


Val 


Tvr 
y 


Ser 


Gin 






355 










360 


Phe 


Leu 


Ser 


Ala 


Ser 


Ser 


Arg 


Thr 




370 










37 5 




Tip 
J. x c 




Pro 


Pro 


P ro 


Val 


Ala 


Gly 


JO J 










O _7 \J 






Ser 


Ser 


Leu 


Glu 


Gin 


Pro 


Asn 


Ala 










405 








Ala 


Ser 


Ser 


Gly 


Leu 


Glu 


Ala 


Asn 








420 










Asp 


Ser 


His 


Val 


Leu 


Glu 


Glu 


Ala 






435 










440 


He 


Pro 


Met 


■ Glu 


Leu 


He 


Asn 


Glu 




450 










455 




Ala 


Ala 


Met 


Leu 


Gly 


Pro 


Glu 


Thr 


465 










470 







Pro 


Glv 


Glv 


Cvs 


Glv 


Ala 


Glv 
y 


Ala 




10 










15 




Leu 


Pro 


Pro 


Gin 


Pro 


Ala 


Ala 


Leu 


25 










30 






Pro 


Cvs 


Ala 


Ala 


Ala 


Ala 


Glv 
y 


Glv 










45 








Ala 


Val 


Ala 


Ala 


Ala 


Glv 


Thr 


Ala 








60 










Ala 


Arg 


He 


Ala 


Val 


LVS 


Lvs 

y 


Ala 






75 










80 


Lvs 
y 


Lvs 
y 


Leu 


Glu 


Lvs 
y 


Leu 


Glv 


Val 




90 










95 




Ser 


Cvs 


Lvs 


Cvs 
y 


Asn 


Glv 
y 


Tro 


Lvs 


105 










110 






Pro 


Ara 


Ala 


Asd 


Leu 


Gin 


Gin 


He 










125 








Arg 


Ser 


Cys 


Ser 


His 


Ala 


Leu 


Ala 








140 










Val 


Ser 


Glu 


Glu 


Glu 


Met 


Asn 


Arg 






155 










160 


Glu 


Tvr 
y 


Leu 


Phe 


Thr 


Cys 


Val 


His 




170 










175 




Gin 


Val 


Tvr 


Phe 


Tvr 
y 


Leu 


Phe 


Lvs 
y 


185 










190 






Arg 


Glv 

wo. y 


Lys 


Pro 


Val 


Val 


Glu 


Glv 










205 








Glu 


Lys 


Pro 


Ser 


He 


Glu 


Gin 


Gly 








220 










Phe 


Ser 


His 


Leu 


Pro 


Ala 


Lys 


Glu 






235 










240 


Lys 


Met 


Phe 


Leu 


Asn 


Arg 


He 


Asn 




25 0 










255 




Gin 


Arg 


Arg 


Leu 


Arg 


Ser 


P ro 


As n 


265 










270 






Asn 


Tvr 
x j ■ u 


Thr 


Ara 


Tro 


Leu 


Cvs 
y 


Tvr 

y 










285 








Ser 


Leu 


Pro 


Arg 


Tvr 


Glu 


Thr 


Thr 








300 










Arg 


Ser 


Val 


Phe 


Thr 


Val 


Met 


Arg 






315 










320 


Gin 


Glu 


Lys 


Asp 


Lys 


Leu 


Pro 


Leu 




330 










335 




His 


Phe 


Pro 


Lvs 


Phe 


Leu 


Ser 


Met 


345 










350 






Asn 


Ser 


Pre 


He 


Tr-D 


Asp 


Gin 


Asd 










365 








Ser 


Gin 


Leu 


Glv 


He 


Gin 


Thr 


Val 








380 










Thr 


He 


Ser 


Tt/r 

i yr 


As n 


Ser 


Thr 


Ser 






395 










400 


Gly 


Ser 


Ser 


Ser 


Pro 


Ala 


Cys 


Lys 




410 










415 




Pro 


Gly 


Glu 


Lys 


Arg 


Lys 


Met 


Thr 


425 










430 






Lys 


Lys 


Pro 


Arg 


Val 


Met 


Gly 


Asp 










445 








Val 


Met 


Ser 


Thr 


He 


Thr 


Asp 


Pro 








460 










Asn 


Phe 


Leu 


Ser 


Ala 


His 


Ser 


Ala 



475 480 
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Arg 


Asp 


Glu 


Ala 


Ala 


Ara 


Leu 


Glu 


Glu 


Arg 


Arg 


Gly 


val 


He 


Glu 


Phe 






485 










490 










495 




His 


Val 


Val 


Glv 
500 


Asn 


Ser 


Leu 


Asn 


Gin 
505 


Lys 


Pro 


Asn 


Lys 


Lys 
510 


He 


Leu 




T rn 


Leu 


Val 


Glv 


Leu 


Gin' 


Asn 


Val 


Phe 


Ser 


His 


Gin 


Leu 


Pro 


Arg 




515 










520 










525 








Met 


Pro 


Lys 


Glu 


Tvr 


He 


Thr 


Arg 


Leu 


Val 


Phe 


Asp 


Pro 


Lys 


His 


Lys 




53 0 








535 










540 










Th r 


Leu 


Ala 


Leu 


He 


LVS 


Asp 


Gly 


Arg 


Val 


He 


Gly 


Gly 


He 


Cys 


Phe 


545 










550 










555 










560 




Met 


Phe 


Pro 


Ser 


Gin 


GlV 


Phe 


Thr 


Glu 


He 


Val 


Phe 


Cys 


Ala 


Val 








565 










570 










575 




X 11 X. 


Ser 


Asn 


Glu 


Gin 


Val 


LVS 


Glv 


Tyr 


Gly 


Thr 


His 


Leu 


Met 


Asn 


His 








58 0 








585 










590 






Leu 


Lys 


Glu 


Tyr 


His 


He 


Lys 


His 


Asp 


He 


Leu 


Asn 


Phe 


Leu 


Thr 


Tyr 




595 








600 










605 








Ala 


Asp 


Glu 


Tyr 


Ala 


He 


Gly 


Tyr 


Phe 


Lys 


Lys 


Gin 


Gly 


Phe 


Ser 


Lys 




610 








615 










620 










Glu 


He 


Lys 


He 


Pro 


Lys 


Thr 


Lys 


Tyr 


Val 


Gly 


Tyr 


He 


Lys 


Asp 


Tyr 


62 5 








630 










635 










640 


Glu 


Gl V 


Ala 


Thr 


Leu 


Met 


Gly 


Cys 


Glu 


Leu 


Asn 


Pro 


Arg 


He 


Pro 


Tyr 








645 










650 










655 




Th r 


Glu 


Phe 


Ser 
660 


Val 


He 


He 


Lvs 


Lys 
665 


Gin 


Lys 


Glu 


•He 


He 
670 


Lys 


Lys 


Leu 


He 


Glu 


Arg 


Lys 


Gin 


Ala 


Gin 


lie 


Arg 


Lys 


Val 


Tyr 


Pro 


Gly 


Leu 






67 5 






680 










685 








Ser 


Cys 
690 


Phe 


Lys 


Asp 


Glv 


Val 
695 


Arg 


Gin 


He 


Pro 


He 
700 


Glu 


Ser 


lie 


Pro 


Gly 


He 


Arg 


Glu 


Thr 


GlV 


Trp 


Lys 


Pro 


Ser 


Gly 


Lys 


Glu 


Lys 


Ser 


Lys 


7 n s 








710 










715 










720 


Glu 


Pro 


Arg 


Asp 


Pro 


Asp 


Gin 


Leu 


Tyr 


Ser 


Thr 


Leu 


Lys 


Ser 


lie 


Leu 






725 










730 










735 






Gin 


Val 


Lys 
7 4 0 


Ser 


His 


Gin 


Ser 


Ala 
745 


Trp 


Pro 


Phe 


Met 


Glu 
750 


Pro 


Val 


Lys 


Arg 


Th r 


Glu 


Ala 


Pro 


Glv 

vj j_ y 


Tvr 


Tvr 


Glu 


Val 


He 


Arg 


Ser 


Pro 


Met 


7 55 










n fin 










7 65 








Asp 


Leu 


Lys 


Thr 


Met 


Ser 


Glu 


Arg 


Leu 


Lys 


Asn 


Arg 


Tyr 


Tyr 


Val 


Ser 


77 0 










775 










780 










Lys 


Lys 


Leu 


Phe 


Met 


Ala 


Asp 


Leu 


Gin 


Arg 


Val 


Phe 


Thr 


Asn 


Cys 


Lys 


785 








790 










795 










800 


Glu 


Tyr 


Asn 


Ala 


Pro 


GlU 


Ser 


Glu 


Tyr 


Tyr 


Lys 


Cys 


Ala 


Asn 


He 


Leu 








805 










810 










815 




Glu 


Lys 


Phe 


Phe 


Phe 


Ser 


Lys 


He 


Lys 


Glu 


Ala 


Gly 


Leu 


He 


Asp 


Lys 






820 










825 










830 







(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 481 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Leu Glu Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln
1 5 10 15

Asp Phe Leu Ser Ala Ser Ser Arg Thr Ser Gln Leu Gly Ile Gln Thr
20 25 30

Val Ile Asn Pro Pro Pro Val Ala Gly Thr Ile Ser Tyr Asn Ser Thr
35 40 45

Ser Ser Ser Leu Glu Gln Pro Asn Ala Gly Ser Ser Ser Pro Ala Cys
50 55 60

Lys Ala Ser Ser Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met
65 70 75 80

Thr Asp Ser His Val Leu Glu Glu Ala Lys Lys Pro Arg Val Met Gly
85 90 95

Asp Ile Pro Met Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp
100 105 110

Pro Ala Ala Met Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser
115 120 125

Ala Arg Asp Glu Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu
130 135 140

Phe His Val Val Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile
145 150 155 160

Leu Met Trp Leu Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro
165 170 175

Arg Met Pro Lys Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His
180 185 190

Lys Thr Leu Ala Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys
195 200 205

Phe Arg Met Phe Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala
210 215 220

Val Thr Ser Asn Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn
225 230 235 240

His Leu Lys Glu Tyr His Ile Lys His Asp Ile Leu Asn Phe Leu Thr
245 250 255

Tyr Ala Asp Glu Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser
260 265 270

Lys Glu Ile Lys Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp
275 280 285

Tyr Glu Gly Ala Thr Leu Met Gly Cys Glu Leu Asn Pro Arg Ile Pro
290 295 300

Tyr Thr Glu Phe Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys
305 310 315 320

Lys Leu Ile Glu Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly
325 330 335

Leu Ser Cys Phe Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile
340 345 350

Pro Gly Ile Arg Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser
355 360 365

Lys Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser Ile
370 375 380

Leu Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu Pro
385 390 395 400

Val Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Ser Pro
405 410 415

Met Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr Val
420 425 430

Ser Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys
435 440 445

Lys Glu Tyr Asn Ala Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Asn Ile
450 455 460

Leu Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp
465 470 475 480

Lys

































(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 203 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:



Ar g 


Val 


Val 


Gin 


His 


Thr 


Lys 


Glv 


Cvs 


Lvs 


Arq 


Lvs 


Thr 


Asn 


Gly 


Gly 


-L 








5 










10 










15 






P ro 


He 


Cys 


Ly s 


Gin 


Leu 


He 


Ala 


Leu 


Cys 


Cys 


Tyr 


His 


Ala 


Lys 






20 










25 










30 






tlX 5 


cys 


VJ7-L 1 1 


Glu 


Asn 


Lys 


Cys 


Pro 


Val 


Pro 


Phe 


Cvs 


Leu 


Asn 


He 


Lys 




35 










40 










45 










T \7 <=z 


JJCU 




Gin 


Gin 


Gin 


Leu 


Gin 


His 


Arg 


Leu 


Gin 


Gin 


Ala 


Gin 




o u 










55 










60 










Met 


Le li 


Arg 


Arg 


Arg 


Met 


Ala 


Ser 


Met 


Arg 


Thr 


Gly 


Val 


Val 


Gly 


Gin 












70 










75 










80 








Leu 


Pro 


Ser 


Pro 


Thr 


Pro 


Ala 


Thr 


Pro 


Thr 


Thr 


Pro 


Thr 








85 










90 










95 




Gly 


Gin 


Gin 


Pro 


Thr 


Thr 


Pro 


Gin 


Thr 


Pro 


Gin 


Pro 


Thr 


Ser 


Gin 


Pro 






100 










105 










110 






Gin 


Pro 


Thr 


Pro 


Pro 


Asn 


Ser 


Met 


Pro 


Pro 


Tyr 


Leu 


Pro 


Arg 


Thr 


Gin 






115 










120 










125 








Ala 


Ala 


Gly 


Pro 


Val 


Ser 


Gin 


Gly 


Lys 


Ala 


Ala 


Gly 


Gin 


Val 


Thr 


Pro 




130 








135 










140 










Pro 


Thr 


Pro 


Pro 


Gin 


Thr 


Ala 


Gin 


Pro 


Pro 


Leu 


Pro 


Gly 


Pro 


. Pro 


Pro 


145 










150 










155 










160 


Thr 


Ala 


Val 


Glu 


Met 


Ala 


Met 


Gin 


He 


Gin 


Arg 


Ala 


Ala 


Glu 


Thr 


Gin 










165 










170 










175 




Arg 


Gin 


Met 


Ala 


His 


Val 


Gin 


He 


Phe 


Gin 


Arg 


Pro 


lie 


Gin 


His 


Gin 






180 










185 










190 






Met 


Pro 


Pro 


Met 


Thr 


Pro 


Met 


Ala 


Pro 


Met 


Gly 
















195 










200 






















(2) INFORMATION FOR SEQ ID NO: 4:













(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 351 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala
1 5 10 15

Gly Ala Gly Ala Gly Pro Gly Ala Leu Pro Pro Gln Pro Ala Ala Leu
20 25 30

Pro Pro Ala Pro Pro Gln Gly Ser Pro Cys Ala Ala Ala Ala Gly Gly
35 40 45

Ser Gly Ala Cys Gly Pro Ala Thr Ala Val Ala Ala Ala Gly Thr Ala
50 55 60

Glu Gly Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala
65 70 75 80

Gln Leu Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val
85 90 95

Tyr Ser Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys
100 105 110

Asn Pro Asn Pro Ser Pro Thr Pro Pro Arg Ala Asp Leu Gln Gln Ile
115 120 125

Ile Val Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala
130 135 140

Ala His Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asn Arg
145 150 155 160

Leu Leu Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His
165 170 175

Lys Glu Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys
180 185 190

Leu Leu Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly
195 200 205

Ser Leu Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly
210 215 220

Val Asn Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ala Lys Glu
225 230 235 240

Arg Gln Thr Ile Val Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn
245 250 255

Tyr Trp His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn
260 265 270

Asp Asp Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr
275 280 285

Cys Asn Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr
290 295 300

Gln Val Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Val Met Arg
305 310 315 320

Arg Gln Leu Leu Glu Gln Ala Arg Gln Glu Lys Asp Lys Leu Pro Leu
325 330 335

Glu Lys Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser
340 345 350







(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 476 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Leu Glu Glu Glu Ile Tyr Gly Ala Asn Ser Pro Ile Trp Glu Ser
1 5 10 15

Gly Phe Thr Met Pro Pro Ser Glu Gly Thr Gln Leu Val Pro Arg Pro
20 25 30

Ala Ser Val Ser Ala Ala Val Val Pro Ser Thr Pro Ile Phe Ser Pro
35 40 45

Ser Met Gly Gly Gly Ser Asn Ser Ser Leu Ser Leu Asp Ser Ala Gly
50 55 60

Ala Glu Pro Met Pro Gly Glu Lys Arg Thr Leu Pro Glu Asn Leu Thr
65 70 75 80

Leu Glu Asp Ala Lys Arg Leu Arg Val Met Gly Asp Ile Pro Met Glu
85 90 95

Leu Val Asn Glu Val Met Leu Thr Ile Thr Asp Pro Ala Ala Met Leu
100 105 110

Gly Pro Glu Thr Ser Leu Leu Ser Ala Asn Ala Ala Arg Asp Glu Thr
115 120 125

Ala Arg Leu Glu Glu Arg Arg Gly Ile Ile Glu Phe His Val Ile Gly
130 135 140

Asn Ser Leu Thr Pro Lys Ala Asn Arg Arg Val Leu Leu Trp Leu Val
145 150 155 160

Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg Met Pro Lys Glu
165 170 175

Tyr Ile Ala Arg Leu Val Phe Asp Pro Lys His Lys Thr Leu Ala Leu
180 185 190






Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe Arg Met Phe Pro
195 200 205

Thr Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val Thr Ser Asn Glu
210 215 220

Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His Leu Lys Glu Tyr
225 230 235 240

His Ile Lys His Asn Ile Leu Tyr Phe Leu Thr Tyr Ala Asp Glu Tyr
245 250 255

Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys Asp Ile Lys Val
260 265 270

Pro Lys Ser Arg Tyr Leu Gly Tyr Ile Lys Asp Tyr Glu Gly Ala Thr
275 280 285

Leu Met Glu Cys Glu Leu Asn Pro Arg Ile Pro Tyr Thr Glu Leu Ser
290 295 300

His Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys Leu Ile Glu Arg
305 310 315 320

Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu Ser Cys Phe Lys
325 330 335

Glu Gly Val Arg Gln Ile Pro Val Glu Ser Val Pro Gly Ile Arg Glu
340 345 350

Thr Gly Trp Lys Pro Leu Gly Lys Glu Lys Gly Lys Glu Leu Lys Asp
355 360 365

Pro Asp Gln Leu Tyr Thr Thr Leu Lys Asn Leu Leu Ala Gln Ile Lys
370 375 380

Ser His Pro Ser Ala Trp Pro Phe Met Glu Pro Val Lys Lys Ser Glu
385 390 395 400

Ala Pro Asp Tyr Tyr Glu Val Ile Arg Phe Pro Ile Asp Leu Lys Thr
405 410 415

Met Thr Glu Arg Leu Arg Ser Arg Tyr Tyr Val Thr Arg Lys Leu Phe
420 425 430

Val Ala Asp Leu Gln Arg Val Ile Ala Asn Cys Arg Glu Tyr Asn Pro
435 440 445

Pro Asp Ser Glu Tyr Cys Arg Cys Ala Ser Ala Leu Glu Lys Phe Phe
450 455 460

Tyr Phe Lys Leu Lys Glu Gly Gly Leu Ile Asp Lys
465 470 475













(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2414 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Ala Glu Asn Val Val Glu Pro Gly Pro Pro Ser Ala Lys Arg Pro
1 5 10 15

Lys Leu Ser Ser Pro Ala Leu Ser Ala Ser Ala Ser Asp Gly Thr Asp
20 25 30

Phe Gly Ser Leu Phe Asp Leu Glu His Asp Leu Pro Asp Glu Leu Ile
35 40 45

Asn Ser Thr Glu Leu Gly Leu Thr Asn Gly Gly Asp Ile Asn Gln Leu
50 55 60

Gln Thr Ser Leu Gly Met Val Gln Asp Ala Ala Ser Lys His Lys Gln
65 70 75 80

Leu Ser Glu Leu Leu Arg Ser Gly Ser Ser Pro Asn Leu Asn Met Gly
85 90 95

Val Gly Gly Pro Gly Gln Val Met Ala Ser Gln Ala Gln Gln Ser Ser
100 105 110







wo 
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Pro 


Gl v 


Leu 


Glv 






115 




Ala 


GlV 


Leu 


Thr 




130 






Gin 


Gly 


Pro 


Thr 


145 








Pro 


Ala 


Met 


Glv 


Met 


Leu 


Ala 


Ala 








180 


As n 


Gly 


Ser 


He 






195 




As n 


Pro 


Gly 


Met 




210 






Gin 


Glv 


Ser 


Pro 


225 








Pro 


Leu 


Lys 


Met 


Tyr 


Thr 


Gin 


Asn 








260 


Gin 


lie 


Gin 


Thr 






275 




Met 


Asp 


Lys 


Lys 




290 






Gin 


Pro 


Ala 


Pro 


305 








Gin 


Glv 


Met 


Gly 


Leu 


He 


Gin 


Gin 








340 


Arg 


Arg 


Glu 


Gin 






355 




Cys 


Arg 


Thr 


Met 




370 






Gly 


Ly s 


Ser 


Cys 


385 








Ser 


His 


TrD 


Lys 


Leu 


Lys 


Asn 


Ala 








420 


Ala 


Pro 


Val 


Gly 






435 




Ser 


Ala 


Pro 


Asn 




450 






Glu 


Arg 


Ala 


Tvr 


465 








Pro 


Thr 


Gin 


Pro 


Gly 


Gin 


Ser 


Pro 








500 


Pro 


Met 


Glv 


val 






515 




Ser 


As p 


Ser 


Met 










Ser 


Glu 


Asn 


Ala 


545 








Gin 


Pro 


Ser 


Thr 


Gin 


Asp 


Leu 


Arg 








580 


Pro 


Thr 


Pro 


Asp 






595 





Leu 


He 


Asn 


Ser 








120 


Ser 


Pro 


Asn 


Met 






135 




Gin 


Ser 


Thr 


Gly 




15 0 






Met 


Asn 


Thr 


Gly 


165 








Glv 


Asn 


Gly 


Gin 


Gly 


Ala 


Gly 


Arg 








200 


Glv 


Ser 


Ala 


Glv 






215 




Gin 


Met 


GlV 
y 


Glv 




230 






Glv 


Met 


Met 


Asn 


245 








Pro 


Glv 


Gin 


Gin 


Lys 


Thr 


Val 


Leu 








280 


Ala 


Val 


Pro 


Glv 






295 




Gin 


Val 


Gin 


Gin 




310 






Ser 


Glv 


Ala 


His 


325 








Gin 


Leu 


Val 


Leu 


Ala 


Asn 


Gly 


Glu 








360 


Lys 


Asn 


Val 


Leu 




375 




Gin 


Val 


Ala 


His 




390 






Asn 


Cys 


Thr 


Arg 


405 








Gly 


Asp 


Lys 


Arg 


Leu 


Gly 


Asn 


Pro 








440 


Leu 


Ser 


Thr 


Val 






455 




Ala 


Ala 


Leu 


Glv 




470 






Gin 


Val 


Gin 


Ala 


485 








Gin 


Glv 


Met 


Arq 


Asn 


Gly 


Gly 


Val 








520 


Leu 


His 


Ser 


Ala 






535 




Ser 


Val 


Pro 


Ser 




550 






Thr 


Gly 


He 


Arg 


565 








Asn 


His 


Leu 


Val 


Pro 


Ala 


Ala 


Leu 



600 



77 



Met 


Val 


Lys 


Ser 


Gly 


Met 


Gly 


Thr 








140 


Met 


Met 


Asn 


Ser 






155 




Thr 


Asn 


Ala 


Gly 




170 






Glv 


He 


Met 


Pro 


185 








Glv 
y 


Arg 


Gin 


Asp 


Asn 


Leu 


Leu 


Thr 








220 


Gin 


Thr 


Glv 


Leu 






235 




Asn 


Pro 


Asn 


Pro 




250 






lie 


Gly 


Ala 


Ser 


265 








Ser 


Asn 


Asn 


Leu 


Gly 


Gly 


Met 


Pro 








300 


Pro 


GlV 


Leu 


Val 






315 




Thr 


Ala 


Asp 


Pro 




330 






Leu 


Leu 


His 


Ala 


345 








Val 


Arg 


Gin 


Cvs 


Asn 


His 


Met 


Thr 








380 


Cvs 


Ala 


Ser 


Ser 






395 




His 


Asp 


Cys 


Pro 




410 






Asn 


Gin 


Gin 


Pro 


425 








Ser 


Ser 


Leu 


Glv 


Ser 


Gin 


He 


Asp 








460 


Leu 


Pro 


Tvr 

y 


Gin 






475 




Lys 


Asn 


Gin 


Gin 




490 






Pro 


Mer. 


Ser 


Asn 


505 








Glv 
y 


va \ 


Gin 


Thr 


He 


Asn 


' ■ - 


Gin 








540 


Leu 


G i y 


i ' I O 

; j f» 1 > 


Met 


Lys 


Gin 


~ rp 


Hxs 




570 






His 


Lys 


Leu 


Val 


585 








Lys 


Asp 


Arg 


Arg 



Pro Met Thr Gin 
125 

Ser Gly Pro Asn 

Pro Val Asn Gin 
160 

Met Asn Pro Gly 
175 

Asn Gin Val Met 
190 

Met Gin Tyr Pro 
205 

Glu Pro Leu Gin 

Arg Gly Pro Gin 
240 

Tyr Gly Ser Pro 
255 

Gly Leu Gly Leu 
270 

Ser Pro Phe Ala 
285 

Asn Met Gly Gin 

Thr Pro Val Ala 
320 

Glu Lys Arg Lys 
335 

His Lys Cys Gin 
350 

Asn Leu Pro His 
365 

His Cys Gin Ser 

Arg Gin He lie 
400 

Val Cys Leu Pro 
415 

He Leu Thr Gly 
430 

Val Gly Gin Gin 
445 

Pro Ser Ser He 

Val Asn Gin Met 
480 

Asn Gin Gin Pro 
495 

Met Ser Ala Ser 
510 

Pro Ser Leu Leu 
525 

Asn Pro Met Met 

Pro Thr Ala Ala 
560 

Glu Asp He Thr 
575 

Gin Ala He Phe 
590 

Met Glu Asn Leu 
605 
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Val Ala Tyr Ala Arg Lys Val Glu Gly Asp Met Tyr Glu Ser Ala Asn 

610 615 620 

Asn Arg Ala Glu Tyr Tyr His Leu Leu Ala Glu Lys lie Tyr Lys lie 
625 630 635 640 

Gin Lys Glu Leu Glu Glu Lys Arg Arg Thr Arg Leu Gin Lys Gin Asn 

645 650 655 

Met Leu Pro Asn Ala Ala Gly Met Val Pro Val Ser Met Asn Pro Gly 

660 665 670 

Pro Asn Met Gly Gin Pro Gin Pro Gly Met Thr Ser Asn Gly Pro Leu 

675 680 685 

Pro Asp Pro Ser Met lie Arg Gly Ser Val Pro Asn Gin Met Met Pro 

690 695 700 

Arg lie Thr Pro Gin Ser Gly Leu Asn Gin Phe Gly Gin Met Ser Met 
705 710 715 720 

Ala Gin Pro Pro lie Val Pro Arg Gin Thr Pro Pro Leu Gin His His 

725 730 735 

Gly Gin Leu Ala Gin Pro Gly Ala Leu Asn Pro Pro Met Gly Tyr Gly 

740 745 750 

Pro Arg Met Gin Gin Pro Ser Asn Gin Gly Gin Phe Leu Pro Gin Thr 

755 760 765 

Gin Phe Pro Ser Gin Gly Met Asn Val Thr Asn lie Pro Leu Ala Pro 

770 ' 775 780 

Ser Ser Gly Gin Ala Pro Val Ser Gin Ala Gin Met Ser Ser Ser Ser 
785 790 795 800 

Cys Pro Val Asn Ser Pro lie Met Pro Pro Gly Ser Gin Gly Ser His 

805 810 815 

lie His Cys Pro Gin Leu Pro Gin Pro Ala Leu His Gin Asn Ser Pro 

820 825 830 

Ser Pro Val Pro Ser Arg Thr Pro Thr Pro His His Thr Pro Pro Ser 

835 840 845 

lie Gly Ala Gin Gin Pro Pro Ala Thr Thr lie Pro Ala Pro Val Pro 

850 855 860 

Thr Pro Pro Ala Met Pro Pro Gly Pro Gin Ser Gin Ala Leu His Pro 
865 870 875 880 

Pro Pro Arg Gin Thr Pro Thr Pro Pro Thr Thr Gin Leu Pro Gin Gin 

885 890 895 

Val Gin Pro Ser Leu Pro Ala Ala Pro Ser Ala Asp Gin Pro Gin Gin 

900 905 910 

Gin Pro Arg Ser Gin Gin Ser Thr Ala Ala Ser Val Pro Thr Pro Asn 

915 920 925 

Ala Pro Leu Leu Pro Pro Gin Pro Ala Thr Pro Leu Ser Gin Pro Ala 

930 935 940 

Val Ser lie Glu Gly Gin Val Ser Asn Pro Pro Ser Thr Ser Ser Thr 
945 950 955 960 

Glu Val Asn Ser Gin Ala lie Ala Glu Lys Gin Pro Ser Gin Glu Val 

965 970 975 

Lys Met Glu Ala Lys Met Glu Val Asp Gin Pro Glu Pro Ala Asp Thr 

980 985 990 

Gin Pro Glu Asp lie Ser Glu Ser Lys Val Glu Asp Cys Lys Met Glu 

995 1000 1005 

Ser Thr Glu Thr Glu Glu Arg Ser Thr Glu Leu Lys Thr Glu lie Lys 

1010 1015 1020 

Glu Glu Glu Asp Gin Pro Ser Thr Ser Ala Thr Gin Ser Ser Pro Ala 
025 1030 lvs5 1040 

Pro Gly Gin Ser Lys Lys Lys lie Phe Lys ?ro Glu Glu Leu Arg Gin 

1045 1050 1055 

Ala Leu Met Pro Thr Leu Glu Ala Leu Tyr Arg Gin Asp Pro Glu Ser 

1060 1065 1070 

Leu Pro Phe Arg Gin Pro Val Asp Pro Gin Leu Leu Gly lie Pro Asp 

1075 1080 1085 

Tyr Phe Asp lie Val Lys Ser Pro Met Asp Leu Ser Thr lie Lys Arg 
1090 1095 1100 
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Lys Leu Asp Thr Gly Gin Tyr Gin Glu Pro Trp Gin Tyr Val Asp Asp 
105 1110 1115 1120 

Ile Trp Leu Met Phe Asn Asn Ala Trp Leu Tyr Asn Arg Lys Thr Ser

1125 1130 1135

Arg Val Tyr Lys Tyr Cys Ser Lys Leu Ser Glu Val Phe Glu Gln Glu

1140 1145 1150

Ile Asp Pro Val Met Gln Ser Leu Gly Tyr Cys Cys Gly Arg Lys Leu

1155 1160 1165

Glu Phe Ser Pro Gln Thr Leu Cys Cys Tyr Gly Lys Gln Leu Cys Thr

1170 1175 1180

Ile Pro Arg Asp Ala Thr Tyr Tyr Ser Tyr Gln Asn Arg Tyr His Phe

1185 1190 1195 1200

Cys Glu Lys Cys Phe Asn Glu Ile Gln Gly Glu Ser Val Ser Leu Gly

1205 1210 1215

Asp Asp Pro Ser Gln Pro Gln Thr Thr Ile Asn Lys Glu Gln Phe Ser

1220 1225 1230

Lys Arg Lys Asn Asp Thr Leu Asp Pro Glu Leu Phe Val Glu Cys Thr

1235 1240 1245

Glu Cys Gly Arg Lys Met His Gln Ile Cys Val Leu His His Glu Ile

1250 1255 1260

Ile Trp Pro Ala Gly Phe Val Cys Asp Gly Cys Leu Lys Lys Ser Ala

1265 1270 1275 1280

Arg Thr Arg Lys Glu Asn Lys Phe Ser Ala Lys Arg Leu Pro Ser Thr

1285 1290 1295

Arg Leu Gly Thr Phe Leu Glu Asn Arg Val Asn Asp Phe Leu Arg Arg

1300 1305 1310

Gln Asn His Pro Glu Ser Gly Glu Val Thr Val Arg Val Val His Ala

1315 1320 1325

Ser Asp Lys Thr Val Glu Val Lys Pro Gly Met Lys Ala Arg Phe Val

1330 1335 1340

Asp Ser Gly Glu Met Ala Glu Ser Phe Pro Tyr Arg Thr Lys Ala Leu

1345 1350 1355 1360

Phe Ala Phe Glu Glu Ile Asp Gly Val Asp Leu Cys Phe Phe Gly Met

1365 1370 1375

His Val Gln Glu Tyr Gly Ser Asp Cys Pro Pro Pro Asn Gln Arg Arg

1380 1385 1390

Val Tyr Ile Ser Tyr Leu Asp Ser Val His Phe Phe Arg Pro Lys Cys

1395 1400 1405

Leu Arg Thr Ala Val Tyr His Glu Ile Leu Ile Gly Tyr Leu Glu Tyr

1410 1415 1420

Val Lys Lys Leu Gly Tyr Thr Thr Gly His Ile Trp Ala Cys Pro Pro

1425 1430 1435 1440

Ser Glu Gly Asp Asp Tyr Ile Phe His Cys His Pro Pro Asp Gln Lys

1445 1450 1455

Ile Pro Lys Pro Lys Arg Leu Gln Glu Trp Tyr Lys Lys Met Leu Asp

1460 1465 1470

Lys Ala Val Ser Glu Arg Ile Val His Asp Tyr Lys Asp Ile Phe Lys

1475 1480 1485

Gln Ala Thr Glu Asp Arg Leu Thr Ser Ala Lys Glu Leu Pro Tyr Phe

1490 1495 1500

Glu Gly Asp Phe Trp Pro Asn Val Leu Glu Glu Ser Ile Lys Glu Leu

1505 1510 1515 1520

Glu Gln Glu Glu Glu Glu Arg Lys Arg Glu Glu Asn Thr Ser Asn Glu

1525 1530 1535

Ser Thr Asp Val Thr Lys Gly Asp Ser Lys Asn Ala Lys Lys Lys Asn

1540 1545 1550

Asn Lys Lys Thr Ser Lys Asn Lys Ser Ser Leu Ser Arg Gly Asn Lys

1555 1560 1565

Lys Lys Pro Gly Met Pro Asn Val Ser Asn Asp Leu Ser Gln Lys Leu

1570 1575 1580

Tyr Ala Thr Met Glu Lys His Lys Glu Val Phe Phe Val Ile Arg Leu

1585 1590 1595 1600
Ile Ala Gly Pro Ala Ala Asn Ser Leu Pro Pro Ile Val Asp Pro Asp

1605 1610 1615

Pro Leu Ile Pro Cys Asp Leu Met Asp Gly Arg Asp Ala Phe Leu Thr

1620 1625 1630

Leu Ala Arg Asp Lys His Leu Glu Phe Ser Ser Leu Arg Arg Ala Gln

1635 1640 1645

Trp Ser Thr Met Cys Met Leu Val Glu Leu His Thr Gln Ser Gln Asp

1650 1655 1660

Arg Phe Val Tyr Thr Cys Asn Glu Cys Lys His His Val Glu Thr Arg

1665 1670 1675 1680

Trp His Cys Thr Val Cys Glu Asp Tyr Asp Leu Cys Ile Thr Cys Tyr

1685 1690 1695

Asn Thr Lys Asn His Asp His Lys Met Glu Lys Leu Gly Leu Gly Leu

1700 1705 1710

Asp Asp Glu Ser Asn Asn Gln Gln Ala Ala Ala Thr Gln Ser Pro Gly

1715 1720 1725

Asp Ser Arg Arg Leu Ser Ile Gln Arg Cys Ile Gln Ser Leu Val His

1730 1735 1740

Ala Cys Gln Cys Arg Asn Ala Asn Cys Ser Leu Pro Ser Cys Gln Lys

1745 1750 1755 1760

Met Lys Arg Val Val Gln His Thr Lys Gly Cys Lys Arg Lys Thr Asn

1765 1770 1775

Gly Gly Cys Pro Ile Cys Lys Gln Leu Ile Ala Leu Cys Cys Tyr His

1780 1785 1790

Ala Lys His Cys Gln Glu Asn Lys Cys Pro Val Pro Phe Cys Leu Asn

1795 1800 1805

Ile Lys Gln Lys Leu Arg Gln Gln Gln Leu Gln His Arg Leu Gln Gln

1810 1815 1820

Ala Gln Met Leu Arg Arg Arg Met Ala Ser Met Gln Arg Thr Gly Val

1825 1830 1835 1840

Val Gly Gln Gln Gln Gly Leu Pro Ser Pro Thr Pro Ala Thr Pro Thr

1845 1850 1855

Thr Pro Thr Gly Gln Gln Pro Thr Thr Pro Gln Thr Pro Gln Pro Thr

1860 1865 1870

Ser Gln Pro Gln Pro Thr Pro Pro Asn Ser Met Pro Pro Tyr Leu Pro

1875 1880 1885

Arg Thr Gln Ala Ala Gly Pro Val Ser Gln Gly Lys Ala Ala Gly Gln

1890 1895 1900

Val Thr Pro Pro Thr Pro Pro Gln Thr Ala Gln Pro Pro Leu Pro Gly

1905 1910 1915 1920

Pro Pro Pro Thr Ala Val Glu Met Ala Met Gln Ile Gln Arg Ala Ala

1925 1930 1935

Glu Thr Gln Arg Gln Met Ala His Val Gln Ile Phe Gln Arg Pro Ile

1940 1945 1950

Gln His Gln Met Pro Pro Met Thr Pro Met Ala Pro Met Gly Met Asn

1955 1960 1965

Pro Pro Pro Met Thr Arg Gly Pro Ser Gly His Leu Glu Pro Gly Met

1970 1975 1980

Gly Pro Thr Gly Met Gln Gln Gln Pro Pro Trp Ser Gln Gly Gly Leu

1985 1990 1995 2000

Pro Gln Pro Gln Gln Leu Gln Ser Gly Met Pro Arg Pro Ala Met Met

2005 2010 2015

Ser Val Ala Gln His Gly Gln Pro Leu Asn Met Ala Pro Gln Pro Gly

2020 2025 2030

Leu Gly Gln Val Gly Ile Ser Pro Leu Lys Pro Gly Thr Val Ser Gln

2035 2040 2045

Gln Ala Leu Gln Asn Leu Leu Arg Thr Leu Arg Ser Pro Ser Ser Pro

2050 2055 2060

Leu Gln Gln Gln Gln Val Leu Ser Ile Leu His Ala Asn Pro Gln Leu

2065 2070 2075 2080

Leu Ala Ala Phe Ile Lys Gln Arg Ala Ala Lys Tyr Ala Asn Ser Asn

2085 2090 2095
Pro Gln Pro Ile Pro Gly Gln Pro Gly Met Pro Gln Gly Gln Pro Gly

2100 2105 2110

Leu Gln Pro Pro Thr Met Pro Gly Gln Gln Gly Val His Ser Asn Pro

2115 2120 2125

Ala Met Gln Asn Met Asn Pro Met Gln Ala Gly Val Gln Arg Ala Gly

2130 2135 2140

Leu Pro Gln Gln Gln Pro Gln Gln Gln Leu Gln Pro Pro Met Gly Gly

2145 2150 2155 2160

Met Ser Pro Gln Ala Gln Gln Met Asn Met Asn His Asn Thr Met Pro

2165 2170 2175

Ser Gln Phe Arg Asp Ile Leu Arg Arg Gln Gln Met Met Gln Gln Gln

2180 2185 2190

Gln Gln Gln Gly Ala Gly Pro Gly Ile Gly Pro Gly Met Ala Asn His

2195 2200 2205

Asn Gln Phe Gln Gln Pro Gln Gly Val Gly Tyr Pro Pro Gln Pro Gln

2210 2215 2220

Gln Arg Met Gln His His Met Gln Gln Met Gln Gln Gly Asn Met Gly

2225 2230 2235 2240

Gln Ile Gly Gln Leu Pro Gln Ala Leu Gly Ala Glu Ala Gly Ala Ser

2245 2250 2255

Leu Gln Ala Tyr Gln Gln Arg Leu Leu Gln Gln Gln Met Gly Ser Pro

2260 2265 2270

Val Gln Pro Asn Pro Met Ser Pro Gln Gln His Met Leu Pro Asn Gln

2275 2280 2285

Ala Gln Ser Pro His Leu Gln Gly Gln Gln Ile Pro Asn Ser Leu Ser

2290 2295 2300

Asn Gln Val Arg Ser Pro Gln Pro Val Pro Ser Pro Arg Pro Gln Ser

2305 2310 2315 2320

Gln Pro Pro His Ser Ser Pro Ser Pro Arg Met Gln Pro Gln Pro Ser

2325 2330 2335

Pro His His Val Ser Pro Gln Thr Ser Ser Pro His Pro Gly Leu Val

2340 2345 2350

Ala Ala Gln Ala Asn Pro Met Glu Gln Gly His Phe Ala Ser Pro Asp

2355 2360 2365

Gln Asn Ser Met Leu Ser Gln Leu Ala Ser Asn Pro Gly Met Ala Asn

2370 2375 2380

Leu His Gly Ala Ser Ala Thr Asp Leu Gly Leu Ser Thr Asp Asn Ser

2385 2390 2395 2400

Asp Leu Asn Ser Asn Leu Ser Gln Ser Thr Leu Asp Ile His
2405 2410 2 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2441 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Ala Glu Asn Leu Leu Asp Gly Pro Pro Asn Pro Lys Arg Ala Lys 

1 5 10 15 

Leu Ser Ser Pro Gly Phe Ser Ala Asn Asp Asn Thr Asp Phe Gly Ser 

20 25 30 

Leu Phe Asp Leu Glu Asn Asp Leu Pro Asp Glu Leu Ile Pro Asn Gly

35 40 45 

Glu Leu Ser Leu Leu Asn Ser Gly Asn Leu Val Pro Asp Ala Ala Ser 

50 55 60 

Lys His Lys Gln Leu Ser Glu Leu Leu Arg Gly Gly Ser Gly Ser Ser
65 70 75 80 
[illegible]

[illegible] [illegible] [illegible] Gln Thr Pro Val Gln Pro Pro Ser Val Ala Thr Pro Gln

930 935 940

[illegible] [illegible] [illegible] [illegible] Pro Thr Pro Val His Thr Gln Pro Pro Gly Thr

945 950 955 960

Pro Leu Ser Gln Ala Ala Ala Ser Ile Asp Asn Arg Val Pro Thr Pro

965 970 975

Ser Thr [illegible] [illegible] Ser Ala Glu Thr [illegible] Ser Gln Gln Pro Gly Pro Asp

980 985 990

Val Pro Met Leu Glu Met Lys Thr Glu Val Gln Thr Asp Asp Ala Glu

995 1000 1005

Pro Glu Pro Thr Glu Ser Lys Gly Glu Pro Arg Ser Glu Met Met Glu

1010 1015 1020

Glu Asp Leu Gln Gly Ser Ser Gln Val Lys Glu Glu Thr Asp Thr Thr

1025 1030 1035 1040

Glu Gln Lys Ser Glu Pro Met Glu Val Glu Glu Lys Lys Pro Glu Val

1045 1050 1055

Lys Val Glu Ala Lys Glu Glu Glu Glu Asn Ser Ser Asn Asp Thr Ala

1060 1065 1070
Ser Gln Ser Thr Ser Pro Ser Gln Pro Arg Lys Lys Ile Phe Lys Pro

1075 1080 1085

Glu Glu Leu Arg Gln Ala Leu Met Pro Thr Leu Glu Ala Leu Tyr Arg

1090 1095 1100

Gln Asp Pro Glu Ser Leu Pro Phe Arg Gln Pro Val Asp Pro Gln Leu

1105 1110 1115 1120

Leu Gly Ile Pro Asp Tyr Phe Asp Ile Val Lys Asn Pro Met Asp Leu

1125 1130 1135

Ser Thr Ile Lys Arg Lys Leu Asp Thr Gly Gln Tyr Gln Glu Pro Trp

1140 1145 1150

Gln Tyr Val Asp Asp Val Arg Leu Met Phe Asn Asn Ala Trp Leu Tyr

1155 1160 1165

Asn Arg Lys Thr Ser Arg Val Tyr Lys Phe Cys Ser Lys Leu Ala Glu

1170 1175 1180

Val Phe Glu Gln Glu Ile Asp Pro Val Met Gln Ser Leu Gly Tyr Cys

1185 1190 1195 1200

Cys Gly Arg Lys Tyr Glu Phe Ser Pro Gln Thr Leu Cys Cys Tyr Gly

1205 1210 1215

Lys Gln Leu Cys Thr Ile Pro Arg Asp Ala Ala Tyr Tyr Ser Tyr Gln

1220 1225 1230

Asn Arg Tyr His Phe Cys Gly Lys Cys Phe Thr Glu Ile Gln Gly Glu

1235 1240 1245

Asn Val Thr Leu Gly Asp Asp Pro Ser Gln Pro Gln Thr Thr Ile Ser

1250 1255 1260

Lys Asp Gln Phe Glu Lys Lys Lys Asn Asp Thr Leu Asp Pro Glu Pro

1265 1270 1275 1280

Phe Val Asp Cys Lys Glu Cys Gly Arg Lys Met His Gln Ile Cys Val

1285 1290 1295

Leu His Tyr Asp Ile Ile Trp Pro Ser Gly Phe Val Cys Asp Asn Cys

1300 1305 1310

Leu Lys Lys Thr Gly Arg Pro Arg Lys Glu Asn Lys Phe Ser Ala Lys

1315 1320 1325

Arg Leu Gln Thr Thr Arg Leu Gly Asn His Leu Glu Asp Arg Val Asn

1330 1335 1340

Lys Phe Leu Arg Arg Gln Asn His Pro Glu Ala Gly Glu Val Phe Val

1345 1350 1355 1360

Arg Val Val Ala Ser Ser Asp Lys Thr Val Glu Val Lys Pro Gly Met

1365 1370 1375

Lys Ser Arg Phe Val Asp Ser Gly Glu Met Ser Glu Ser Phe Pro Tyr

1380 1385 1390

Arg Thr Lys Ala Leu Phe Ala Phe Glu Glu Ile Asp Gly Val Asp Val

1395 1400 1405

Cys Phe Phe Gly Met His Val Gln Asp Thr Ala Leu Ile Ala Pro His

1410 1415 1420

Gln Ile Gln Gly Cys Val Tyr Ile Ser Tyr Leu Asp Ser Ile His Phe

1425 1430 1435 1440

Phe Arg Pro Arg Cys Leu Arg Thr Ala Val Tyr His Glu Ile Leu Ile

1445 1450 1455

Gly Tyr Leu Glu Tyr Val Lys Lys Leu Val Tyr Val Thr Ala His Ile

1460 1465 1470

Trp Ala Cys Pro Pro Ser Glu Gly Asp Asp Tyr Ile Phe His Cys His

1475 1480 1485

Pro Pro Asp Gln Lys Ile Pro Lys Pro Lys Arg Leu Gln Glu Trp Tyr

1490 1495 1500

Lys Lys Met Leu Asp Lys Ala Phe Ala Glu Arg Ile Ile Asn Asp Tyr

1505 1510 1515 1520

Lys Asp Ile Phe Lys Gln Ala Asn Glu Asp Arg Leu Thr Ser Ala Lys

1525 1530 1535

Glu Leu Pro Tyr Phe Glu Gly Asp Phe Trp Pro Asn Val Leu Glu Glu

1540 1545 1550

Ser Ile Lys Glu Leu Glu Gln Glu Glu Glu Glu Arg Lys Lys Glu Glu

1555 1560 1565
Ser Thr Ala Ala Ser Glu Thr Pro Glu Gly Ser Gln Gly Asp Ser Lys

1570 1575 1580

Asn Ala Lys Lys Lys Asn Asn Lys Lys Thr Asn Lys Asn Lys Ser Ser

1585 1590 1595 1600

Ile Ser Arg Ala Asn Lys Lys Lys Pro Ser Met Pro Asn Val Ser Asn

1605 1610 1615

Asp Leu Ser Gln Lys Leu Tyr Ala Thr Met Glu Lys His Lys Glu Val

1620 1625 1630

Phe Phe Val Ile His Leu His Ala Gly Pro Val Ile Ser Thr Gln Pro

1635 1640 1645

Pro Ile Val Asp Pro Asp Pro Leu Leu Ser Cys Asp Leu Met Asp Gly

1650 1655 1660

Arg Asp Ala Phe Leu Thr Leu Ala Arg Asp Lys His Trp Glu Phe Ser

1665 1670 1675 1680

Ser Leu Arg Arg Ser Lys Trp Ser Thr Leu Cys Met Leu Val Glu Leu

1685 1690 1695

His Thr Gln Gly Gln Asp Arg Phe Val Tyr Thr Cys Asn Glu Cys Lys

1700 1705 1710

His His Val Glu Thr Arg Trp His Cys Thr Val Cys Glu Asp Tyr Asp

1715 1720 1725

Leu Cys Ile Asn Cys Tyr Asn Thr Lys Ser His Thr His Lys Met Val

1730 1735 1740

Lys Trp Gly Leu Gly Leu Asp Asp Glu Gly Ser Ser Gln Gly Glu Pro

1745 1750 1755 1760

Gln Ser Lys Ser Pro Gln Glu Ser Arg Arg Leu Ser Ile Gln Arg Cys

1765 1770 1775

Ile Gln Ser Leu Val His Ala Cys Gln Cys Arg Asn Ala Asn Cys Ser

1780 1785 1790

Leu Pro Ser Cys Gln Lys Met Lys Arg Val Val Gln His Thr Lys Gly

1795 1800 1805

Cys Lys Arg Lys Thr Asn Gly Gly Cys Pro Val Cys Lys Gln Leu Ile

1810 1815 1820

Ala Leu Cys Cys Tyr His Ala Lys His Cys Gln Glu Asn Lys Cys Pro

1825 1830 1835 1840

Val Pro Phe Cys Leu Asn Ile Lys His Asn Val Arg Gln Gln Gln Ile

1845 1850 1855

Gln His Cys Leu Gln Gln Ala Gln Leu Met Arg Arg Arg Met Ala Thr

1860 1865 1870

Met Asn Thr Arg Asn Val Pro Gln Gln Ser Leu Pro Ser Pro Thr Ser

1875 1880 1885

Ala Pro Pro Gly Thr Pro Thr Gln Gln Pro Ser Thr Pro Gln Thr Pro

1890 1895 1900

Gln Pro Pro Ala Gln Pro Gln Pro Ser Pro Val Asn Met Ser Pro Ala

1905 1910 1915 1920

Gly Phe Pro Asn Val Ala Arg Thr Gln Pro Pro Thr Ile Val Ser Ala

1925 1930 1935

Gly Lys Pro Thr Asn Gln Val Pro Ala Pro Pro Pro Pro Ala Gln Pro

1940 1945 1950

Pro Pro Ala Ala Val Glu Ala Ala Arg Gln Ile Glu Arg Glu Ala Gln

1955 1960 1965

Gln Gln Gln His Leu Tyr Arg Ala Asn Ile Asn Asn Gly Met Pro Pro

1970 1975 1980

Gly Arg Asp Gly Met Gly Thr Pro Gly Ser Gln Met Thr Pro Val Gly

1985 1990 1995 2000

Leu Asn Val Pro Arg Pro Asn Gln Val Ser Gly Pro Val Met Ser Ser

2005 2010 2015

Met Pro Pro Gly Gln Trp Gln Gln Ala Pro Ile Pro Gln Gln Gln Pro

2020 2025 2030

Met Pro Gly Met Pro Arg Pro Val Met Ser Met Gln Ala Gln Ala Ala

2035 2040 2045

Val Ala Gly Pro Arg Met Pro Asn Val Gln Pro Asn Arg Ser Ile Ser

2050 2055 2060
Pro Ser Ala Leu Gln Asp Leu Leu Arg Thr Leu Lys Ser Pro Ser Ser

2065 2070 2075 2080

Pro Gln Gln Gln Gln Gln Val Leu Asn Ile Leu Lys Ser Asn Pro Gln

2085 2090 2095

Leu Met Ala Ala Phe Ile Lys Gln Arg Thr Ala Lys Tyr Val Ala Asn

2100 2105 2110

Gln Pro Gly Met Gln Pro Gln Pro Gly Leu Gln Ser Gln Pro Gly Met

2115 2120 2125

Gln Pro Gln Pro Gly Met His Gln Gln Pro Ser Leu Gln Asn Leu Asn

2130 2135 2140

Ala Met Gln Ala Gly Val Pro Arg Pro Gly Val Pro Pro Pro Gln Pro

2145 2150 2155 2160

Ala Met Gly Gly Leu Asn Pro Gln Gly Gln Ala Leu Asn Ile Met Asn

2165 2170 2175

Pro Gly His Asn Pro Asn Met Thr Asn Met Asn Pro Gln Tyr Arg Glu

2180 2185 2190

Met Val Arg Arg Gln Leu Leu Gln His Gln Gln Gln Gln Gln Gln Gln

2195 2200 2205

Gln Gln Gln Gln Gln Gln Gln Gln Asn Ser Ala Ser Leu Ala Gly Gly

2210 2215 2220

Met Ala Gly His Ser Gln Phe Gln Gln Pro Gln Gly Pro Gly Gly Tyr

2225 2230 2235 2240

Ala Pro Ala Met Gln Gln Gln Arg Met Gln Gln His Leu Pro Ile Gln

2245 2250 2255

Gly Ser Ser Met Gly Gln Met Ala Ala Pro Met Gly Gln Leu Gly Gln

2260 2265 2270

Met Gly Gln Pro Gly Leu Gly Ala Asp Ser Thr Pro Asn Ile Gln Gln

2275 2280 2285

Ala Leu Gln Gln Arg Ile Leu Gln Gln Gln Gln Met Lys Gln Gln Ile

2290 2295 2300

Gly Ser Pro Gly Gln Pro Asn Pro Met Ser Pro Gln Gln His Met Leu

2305 2310 2315 2320

Ser Gly Gln Pro Gln Ala Ser His Leu Pro Gly Gln Gln Ile Ala Thr

2325 2330 2335

Ser Leu Ser Asn Gln Val Arg Ser Pro Ala Pro Val Gln Ser Pro Arg

2340 2345 2350

Pro Gln Ser Gln Pro Pro His Ser Ser Pro Ser Pro Arg Ile Gln Pro

2355 2360 2365

Gln Pro Ser Pro His His Val Ser Pro Gln Thr Gly Thr Pro His Pro

2370 2375 2380

Gly Leu Ala Val Thr Met Ala Ser Ser Met Asp Gln Gly His Leu Gly

2385 2390 2395 2400

Asn Pro Glu Gln Ser Ala Met Leu Pro Gln Leu Asn Thr Pro Asn Arg

2405 2410 2415

Ser Ala Leu Ser Ser Glu Leu Ser Leu Val Gly Asp Thr Thr Gly Asp

2420 2425 2430

Thr Leu Glu Lys Phe Val Glu Gly Leu

2435 2440

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 813 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



Met Ala Glu Ala Gly Gly Ala Gly Ser Pro Ala Leu Pro Pro Ala Pro 
1 5 10 15
Pro His Gly Ser Pro Arg Thr Leu Ala Thr Ala Ala Gly Ser Ser Ala

20 25 30

Ser Cys Gly Pro Ala Thr Pro Val Ala Ala Ala Gly Thr Ala Glu Gly

35 40 45

Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala Gln Leu

50 55 60

Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val Tyr Ser

65 70 75 80

Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys Asn Pro

85 90 95

Asn Pro Ser Pro Thr Pro Pro Arg Gly Asp Leu Gln Gln Ile Ile Val

100 105 110

Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala Ala His

115 120 125

Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asp Arg Leu Leu

130 135 140

Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His Lys Glu

145 150 155 160

Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys Leu Leu

165 170 175

Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly Ser Leu

180 185 190

Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly Val Asn

195 200 205

Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ser Lys Glu Arg Gln

210 215 220

Thr Thr Ile Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn Tyr Trp

225 230 235 240

His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn Asp Asp

245 250 255

Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr Cys Asn

260 265 270

Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr Lys Val

275 280 285

Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Ile Met Arg Arg Gln

290 295 300

Leu Leu Glu Gln Ala Arg Gln Lys Lys Asp Lys Leu Pro Leu Glu Lys

305 310 315 320

Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser Met Leu Glu

325 330 335

Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln Asp Phe Leu

340 345 350

Ser Ala Ser Ser Arg Thr Ser Pro Leu Gly Ile Gln Thr Val Ile Ser

355 360 365

Pro Pro Val Thr Gly Thr Ala Leu Phe Ser Ser Asn Ser Thr Ser His

370 375 380

Glu Gln Ile Asn Gly Gly Arg Thr Ser Pro Gly Cys Arg Gly Ser Ser

385 390 395 400

Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met Asn Asn Ser His

405 410 415

Ala Pro Glu Glu Ala Lys Arg Ser Arg Val Met Gly Asp Ile Pro Val

420 425 430

Glu Leu Ile Asn Glu Val Met Ser Thr [illegible] [illegible] Asp Pro Ala Gly Met

435 440 445

Leu Gly Pro Glu Thr Asn Phe Leu Ser [illegible] [illegible] [illegible] Ala Arg Asp Glu

450 455 460

Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu Phe His Val Val

465 470 475 480

Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile Leu Met Trp Leu

485 490 495

Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg Met Pro Lys

500 505 510

Glu [illegible] Ile Thr Arg Leu Val Phe Asp Pro Lys His Lys Thr Leu Ala

515 520 525

Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe Arg Met Phe

530 535 540

Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val Thr Ser Asn

545 550 555 560

Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His Leu Lys Glu

565 570 575

Tyr His Ile Lys His Glu Ile Leu Asn Phe Leu Thr Tyr Ala Asp Glu

580 585 590

Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys Glu Ile Lys

595 600 605

Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp Tyr Glu Gly Ala

610 615 620

Thr Leu Met Gly Cys Glu Leu Asn Pro Gln Ile Pro Tyr Thr Glu Phe

625 630 635 640

[illegible] Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys Leu Ile Glu

645 650 655

Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu Ser Cys Phe

660 665 670

Lys [illegible] Gly Val Arg Gln Ile Pro Ile Glu Ser Ile Pro Gly Ile Arg

675 680 685

Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser Lys Glu Pro Lys

690 695 700

Val Asp Pro Glu His Val Tyr Ser Thr Leu Lys Asn Ile Leu Gln Gln

705 710 715 720

Lys Asn His Pro Asn Ala Trp Pro Phe Met Glu Pro Val Lys Arg Thr

725 730 735

Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Phe Pro Met Asp Leu Lys

740 745 750

Thr Met Ser Glu Arg Leu Arg Asn Arg Tyr Tyr Val Ser Lys Lys Leu

755 760 765

Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys Lys Glu Tyr Asn

770 775 780

Pro Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Ser Ile Leu Glu Lys Phe

785 790 795 800

Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys

805 810

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

His Thr Lys Gly Cys Lys Arg Lys Thr Asn Gly Gly Cys Pro Val Cys

1 5 10 15

Lys Gln Leu Ile Ala Leu Cys Cys Tyr His Ala Lys His Cys Gln Glu

20 25 30

Asn Lys Cys Pro Val Pro Phe Cys Leu Asn Ile Lys His Asn Val Arg

35 40 45

Gln Gln

50

(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2204 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

ACCCACTCCC CCCAGAGCCG ACCTGCAGCA AATAATTGTC AGTCTAACAG AATCCTGTCG 60

GAGTTGTAGC CATGCCCTAG CTGCTCATGT TTCCCACCTG GAGAATGTGT CAGAGGAAGA 120

AATGAACAGA CTCCTGGGAA TAGTATTGGA TGTGGAATAT CTCTTTACCT GTGTCCACAA 180

GGAAGAAGAT GCAGATACCA AACAAGTTTA TTTCTATCTA TTTAAGCTCT TGAGAAAGTC 240

TATTTTACAA AGAGGAAAAC CTGTGGTTGG AAGGCTCTTT GGAAAAGAAA CCCCCATTTG 300

AAAAACCTAG CATTGAACAG GGTGTGAATA ACTTTGTGCA GTACAAATTT AGTCACCTGC 360

CAGCAAAAAG AAAGGCAAAC CAATAGTTGA GTTGGCAAAA ATGTTCCTAA ACCGCATCAC 420

CTATTGGCAT CTGGAGGCAC CATCTCAACG AGACTGCGAT CTCCAATGAT GATATTCTGG 480

ATACAAAGAG AACTACACAA GGTGGCTGTG TTACTGCAAC GTGCCACAGT TCTGCGACAG 540

TCTACCTCGG TACGAAACCA CACAGGTGTT TGGGAGAACA TCGTTCGCTC GGTCTTCACT 600

GTTATGAGGC GACAACTCCT GGAACAAGCA AGACAGGAAA AAGATAAACT GCCTCTTGAA 660

AAACGAACTC TAATCCTCAC TCATTTCCCA AAATTTCTGT CCATGCTAGA AGAAGAAGTA 720

TATAGTCAAA ACTCTCCCAT CTGGGATCAC CATTTTCTCT CAGCCTCTTC CAGAACCAGC 780

CAGCTAGGCA TCCAAACAGT TATCAATCAC CTCCTGTGGC TGGGACAATT TCATACAATT 840

CAACCTCATC TTCCCTTGAG CAGCCAAACG CAGGGAGCAG CAGTCCTGCC TGCAAAGCCT 900

CTTCTGGACT TGAGGCAAAC CCAGGAGAAA AGAGGAAAAT GACTGATTCT CATGTTCTGG 960

AGGAGGCCAA GAAACCCCGA GTTATGGGGG ATATTCCGAT GGAATTAATC AACGAGGTTA 1020

TGTCTACCAT CACGGACCCT GCAGCAATGC TTGGACCAGA GACCAATTTT CTGTCAGCAC 1080

ACTCGGCCAG GGATGAGGCG GCAAGGTTGG AAGAGCGCAG GGGTGTAATT GAATTTCACG 1140

TGGTTGGCAA TTCCCTCAAC CAGAAACCAA ACAAGAAGAT CCTGATGTGG CTGGTTGGCC 1200

TACAGAACGT TTTCTCCCAC CAGCTGCCCC GAATGCCAAA AGAATACATC ACACGGCTCG 1260

TCTTTGACCC GAAACACAAA ACCCTTGCTT TAATTAAAGA TGGCCGTGTT ATTGGTGGTA 1320

TCTGTTTCCG TATGTTCCCA TCTCAAGGAT TCACAGAGAT TGTCTTCTGT GCTGTAACCT 1380

CAAATGAGCA AGTCAAGGGC TATGGAACAC ACCTGATGAA TCATTTGAAA GAATATCACA 1440

TAAAGCATGA CATCCTGAAC TTCCTCACAT ATGCAGATGA ATATGCAATT GGATACTTTA 1500

AGAAACAGGG TTTCTCCAAA GAAATTAAAA TACCTAAAAC CAAATATGTT GGCTATATCA 1560

AGGATTATGA AGGAGCCACT TTAATGGGAT GTGAGCTAAA TCCACGGATC CCGTACACAG 1620

AATTTTCTGT CATCATTAAA AAGCAGAAGG AGATAATTAA AAAACTGATT GAAAGAAAAC 1680

AGGCACAAAT TCGAAAAGTT TACCCTGGAC TTTCATGTTT TAAAGATGGA GTTCGACAGA 1740

TTCCTATAGA AAGCATTCCT GGAATTAGAG AGACAGGCTG GAAACCGAGT GGAAAAGAGA 1800

AAAGTAAAGA GCCCAGAGAC CCTGACCAGC TTTACAGCAC GCTCAAGAGC ATCCTCCAGC 1860

AGGTGAAGAG CCATCAAAGC GCTTGGCCCT TCATGGAACC TGTGAAGAGA ACAGAAGCTC 1920

CAGGATATTA TGAAGTTATA AGGTCCCCCA TGGATCTCAA AACCATGAGT GAACGCCTCA 1980

AGAATAGGTA CTACGTGTCT AAGAAATTAT TCATGGCAGA CTTACAGCGA GTCTTTACCA 2040

ATTGCAAAGA GTACAACGCC CCTGAGAGTG AATACTACAA ATGTGCCAAT ATCCTGGAGA 2100

AATTCTTCTT CAGTAAAATT AAGGAAGCTG GATTAATTGA CAAGTGATTT TTTTTCCCCC 2160

TCTGCTTCTT AGAAACTCAC CAAGCAGTGT GCCTAAAGCA AGGT 2204

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2093 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GAATTCCGGC GAAACCACTC ATGTCTTTGG GCGAAGCCTT CTCCGGTCCA TTTTCACCGT 60

TACCCGCCGG CAGCTGCTGG AAAAGTTCCG AGTGGAGAAG GACAAATTGG TGCCCGAGAA 120

GAGGACCCTC ATCCTCACTC ACTTCCCCAA GTAAGGCTCC TTCTGGCCTA CCAGGATTTG 180

GCCCCAAGTT CACATCCTCC CTGTTGTCCC CTTTTTTCCA GGAAGGCTTC CTGGATTGGT 240

CCCTCCTCTC CCTCCATGGG CCTTTTGGGA TCTGGGCGTC TACCTGGCAG ACTTGCCCAT 300

GGCCCAGAAG CAACTTGCTA GTACTAGTCT GGGGATGGCA GATTCCTGTC CATGCTGGAG 360

GAGGAGATCT ATGGGGCAAA CTCTCCAATC TGGGAGTCAG GCTTCACCAT GCCACCCTCA 420

GAGGGGACAC AGCTGGTTCC CCGGCCAGCT TCAGTCAGTG CAGCGGTTGT TCCCAGCACC 480

CCCATCTTCA GCCCCAGCAT GGGTGGGGGC AGCAACAGCT CCCTGAGTCT GGATTCTGCA 540

GGGGCCGAGC CTATGCCAGG CGAGAAGAGG ACGCTCCCAG AGAACCTGAC CCTGGAGGAT 600

GCCAAGCGGC TCCGTGTGAT GGGTGACATC CCCATGGAGC TGGTCAATGA GGTCATGCTG 660

ACCATCACTG ACCCTGCTGC CATGCTGGGG CCTGAGACGA GCCTGCTTTC GGCCAATGCG 720

GCCCGGGATG AGACAGCCCG CCTGGAGGAG CGCCGCGGCA TCATCGAGTT CCATGTCATC 780

GGCAACTCAC TGACGCCCAA GGCCAACCGG CGGGTGTTGC TGTGGCTCGT GGGGCTGCAG 840

AATGTCTTTT CCCACCAGCT GCCGCGCATG CCTAAGGAGT ATATCGCCCG CCTCGTCTTT 900

GACCCGAAGC ACAAGACTCT GGCCTTGATC AAGGATGGGC GGGTCATCGG TGGCATCTGC 960

TTCCGCATGT TTCCCACCCA GGGCTTCACG GAGATTGTCT TCTGTGCTGT CACCTCGAAT 1020

GAGCAGGTCA AGGGTTATGG GACCCACCTG ATGAACCACC TGAAGGAGTA TCACATCAAG 1080

CACAACATTC TCTACTTCCT CACCTACGCC GACGAGTACG CCATCGGCTA CTTCAAAAAG 1140

CAGGGTTTCT CCAAGGACAT CAAGGTGCCC AAGAGCCGCT ACCTGGGCTA CATCAAGGAC 1200

TACGAGGGAG CGACGCTGAT GGAGTGTGAG CTGAATCCCC GCATCCCCTA CACGGAGCTG 1260

TCCCACATCA TCAAGAAGCA GAAAGAGATC ATCAAGAAGC TGATTGAGCG CAAACAGGCC 1320

CAGATCCGCA AGGTCTACCC GGGGCTCAGC TGCTTCAAGG AGGGCGTGAG GCAGATCCCT 1380

GTGGAGAGCG TTCCTGGCAT TCGAGAGACA GGCTGGAAGC CATTGGGGAA GGAGAAGGGG 1440

AAGGAGCTGA AGGACCCCGA CCAGCTCTAC ACAACCCTCA AAAACCTGCT GGCCCAAATC 1500

AAGTCTCACC CCAGTGCCTG GCCCTTCATG GAGCCTGTGA AGAAGTCGGA GGCCCCTGAC 1560

TACTACGAGG TCATCCGCTT CCCCATTGAC CTGAAGACCA TGACTGAGCG GCTGCGAAGC 1620

CGCTACTACG TGACCCGGAA GCTCTTTGTG GCCGACCTGC AGCGGGTCAT CGCCAACTGT 1680

CGCGAGTACA ACCCCCCGGA CAGCGAGTAC TGCCGCTGTG CCAGCGCCCT GGAGAAGTTC 1740

TTCTACTTCA AGCTCAAGGA GGGAGGCCTC ATTGACAAGT AGGCCCATCT TTGGGCCGCA 1800

GCCCTGACCT GGAATGTCTC CACCTCGGAT TCTGATCTGA TCCTTAGGGG GTGCCCTGGC 1860

CCCACGGACC CGACTCAGCT TGAGACACTC CAGCCAAGGG TCCTCCGGAC CCGATCCTGC 1920

AGCTCTTTCT GGACCTTCAG GCACCCCCAA GCGTGCAGCT CTGTCCCAGC CTTCACTGTG 1980

TGTGAGAGGT CTCCTGGGTT GGGGCCCAGC CCCTCTAGAG TAGCTGGTGG CCAGGGATGA 2040

ACCTTGCCCA GCCGTGGTGG CCCCCAGGCC TGGTCCCCAA GAGCCCGGAA TTC 2093

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9046 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CCTTGTTTGT GTGCTAGGCT GGGGGGGAGA GAGGGCGAGA GAGAGCGGGC GAGAGTGGGC 60 

AAGCAGGACG CCGGGCTGAG TGCTAACTGC GGGAC GCAGA GAGTGCGGAG GGGAGTCGGG 12 0 

TCGGAGAGAG GCGGCAGGGG CCAGAACAGT GGCAGGGGGC CCGGGGCGCA CGGGCTGAGG 18 0 

CGACCCCCAG CCCCCTCCCG TCCGCACACA CCCCCACCGC GGTCCAGCAG CCGGGCCGGC 2 40 

GTCGACGCTA GGGGGGACCA T T AC AT AAC C CGCGCCCCGG CCGTCTTCTC CCGCCGCCGC 300 

GGCGCCCGAA CTGAGCCCGG GGCGGGCGCT CCAGCACTGG CCGCCGGCGT GGGGCGTAGC 3 60 

AGCGGCCGTA T T ATT AT TT C GCGGAAAGGA AGGCGAAGGA GGGGAGCGCC GGCGCGAGGA 420 

GGGGCCGCCT GCGCCCGCCG CCGGAGCGGG GCCTCCTCGG TGGGCTCCGC GTCGGCGCGG 480 

GCGTGCGGGC GGCGCTGCTC GGCCCGGCCC CCTCGGCCCT CTGGTCCGGC CAGCTCCGCT 540 

CCCGGCGTCC TTGCCGCGCC TCCGCCGGCC GCCGCGCGAT GTGAGGCGGC GGCGCCAGCC 600 

TGGCTCTCGG CTCGGGCGAG TTCTCTGCGG CCATTAGGGG CCGGTGCGGC GGCGGCGCGG 660 

AGCGCGGCGG CAGGAGGAGG GTTCGGAGGG TGGGGGCGCA GGCCCGGGAG GGGGCACCGG 720 

GAGGAGGT GA GTGTCTCTTG TCGCCTCCTC CTCTCCCCCC TTTTCGCCCC CGCCTCCTTG 7 80 

TGGCGATGAG AAGGAGGAGG ACAGCGCCGA GGAGGAAGAG GTTGATGGCG GCGGC GGAGC 8 40 

T C C GAGAGAC CTCGGCTGGG CAGGGGCCGG CCGTGGCGGG CCGGGGACTG CGCCTCTAGA 900 

GCCGCGAGTT CTCGGGAATT . CGCCGCAGCG GACCGGCCTC GGCGAATTTG TGCTCTTGTG 960 

CCCTCCTCCG GGCTTGGGCC AGGCCGGCCC CTCGCACTTG CCCTTACCTT TTCTATCGAG 1020 

TCCGCATCCC TCTCCAGCCA CTGCGACCCG GCGAAGAGAA AAAGGAACTT CCCCCACCCC 108 0 

CTCGGGTGCC GTC GGAGC CC CCCAGCCCAC CCCTGGGTGC GGCGCGGGGA CCCCGGGCCG 1140 

AAGAAGAGAT TTCCTGAGGA TTCTGGTTTT CCTCGCTTGT ATCTCCGAAA GAATTAAAAA 1200 

TGGCCGAGAA TGTGGTGGAA CCGGGGCCGC CTTCAGC CAA GCGGCCTAAA CTCTCATCTC 12 60 

CGGCCCTCTC GGCGTCCGCC AGCGATGGCA CAGATTTTGG CTCTCTATTT GACTTGGAGC 132 0 

ACGACTTACC AGATGAATTA ATCAACTCTA CAGAATTGGG ACTAACCAAT GGT GGT GAT A 138 0 
TTAATCAGCT TCAGACAAGT CTTGGCATGG TACAAGATGC AGCTTCTAAA CATAAACAGC 1440

TGTCAGAATT GCTGCGATCT GGTAGTTCCC CTAACCTCAA TATGGGAGTT GGTGGCCCAG 1500 

GTCAAGTCAT GGCCAGCCAG GCCCAACAGA GCAGTCCTGG ATTAGGTTTG ATAAATAGCA 15 60 

TGGTCAAAAG CCCAATGACA CAGGCAGGCT T&ACTTCTCC CAACATGGGG AT GGGCACTA 162 0 

GTGGACCAAA TCAGGGTCCT AC GCAGT CAA CAGGTATGAT GAACAGTCCA GTAAATCAGC 168 0 

CTGCCATGGG AAT GAAC AC A GGGACGAATG CGGGCATGAA TCCTGGAATG TTGGCTGCAG 17 40 

GCAAT GGAC A AGGGAT AAT G CCTAATCAAG T CAT GAAC GG TTCAATTGGA GCAGGCCGAG 18 00 

GGCGACAGGA TAT GCAGT AC CCAAACCCAG GC AT GGGAAG TGCTGGCAAC TTACTGACTG 18 60 

AGCCTCTTCA GCAGGGCTCT CCCCAGATGG GAGGACAAAC AGGATTGAGA GGCCCCCAGC 192 0 

CTCTTAAGAT GGGAAT GAT G AACAACCCCA ATCCTTATGG T T C AC CAT AT ACTCAGAATC 198 0 

CTGGACAGCA GATT GGAGCC AGTGGCCTTG GTCTCCAGAT TCAGACAAAA ACTGTACTAT 2040 

CAAATAACTT ATCTCCATTT GCT AT GGAC A AAAAGGCAGT TCCTGGTGGA GGAATGCCCA 2100 

ACATGGGTCA ACAGCCAGCC CCGCAGGTCC AGCAGCCAGG TCTGGTGACT CCAGTTGCCC 2160 

AAGGGAT GGG TTCTGGAGCA CATACAGCTG AT C C AGAGAA GCGCAAGCTC AT C C AG C AG C 222 0 

AGCTTGTTCT CCTTTTGCAT GCT CACAAGT GCCAGCGCCG GGAACAGGCC AATGGGGAAG 22 8 0 

TGAGGCAGTG CAACCTTCCC CACTGTCGCA CAAT GAAGAA TGTCCTAAAC CACATGACAC 2 34 0 

ACTGCCAGTC AGGCAAGTCT TGCCAAGTGG CACACTGTGC ATCTTCTCGA C AAAT CAT T T 2 4 00 

CAC ACT GGAA GAATTGTACA AGACATGATT GTCCTGTGTG TCTCCCCCTC AAAAAT GCT G 2 4 60 

GTGATAAGAG AAAT CAAC AG CCAATTTTGA CTGGAGCACC CGTTGGACTT GGAAATCCTA 2 52 0 

GCTCTCTAGG GGTGGGTCAA CAGTCTGCCC C CAAC CTAAG CACTGTTAGT CAGATTGATC 2 58 0 

CCAGCTCCAT AGAAAGAGCC TATGCAGCTC TTGGACTACC CT AT CAAGT A AAT CAGAT GC 2 64 0 

CGACACAACC CCAGGTGCAA GCAAAGAACC AG C AGAAT C A GCAGCCTGGG CAGTCTCCCC 27 00 

AAGGCATGCG GC C CAT GAGC AACATGAGTG CTAGTCCTAT GGGAGTAAAT GGAGGT GTAG 2 7 60 

GAGTTCAAAC GCCGAGTCTT CTTTCTGACT CAAT GT T GC A TTCAGCCATA AATTCTCAAA 2 82 0 

ACCCAATGAT GAGT GAAAAT GCCAGTGTGC CCTCCCTGGG TCCTATGCCA ACAGCAGCTC 2880 

AACCATCCAC TACTGGAATT CGGAAACAGT GGCACGAAGA TAT TACT CAG GATCTTCGAA 2 94 0 

AT CAT C T T GT TCACAAACTC GTCCAAGCCA TATTTCCTAC GCCGGATCCT GCTGCTTTAA 3000 

AAGACAGACG GAT GGAAAAC CTAGTTGCAT ATGCTCGGAA AGTTGAAGGG GACATGTATG 3 060 

AATCTGCAAA CAAT C GAGC G GAAT AC T AC C ACCTTCTAGC TGAGAAAATC TATAAGATCC 312 0 

AGAAAGAACT AGAAGAAAAA C GAAGGAC C A GACTACAGAA GC AGAACAT G C T AC C AAAT G 318 0 

CTGCAGGCAT GGTTCCAGTT T C CAT GAAT C CAGGGCCTAA CAT GG GACAG CCGCAACCAG 32 4 0 

GAAT G AC T T C TAATGGCCCT CTACCTGACC CAAGTATGAT CCGTGGCAGT GTGCCAAACC 33 0 0 

AGATGATGCC T C GAAT AACT CCACAATCTG GT T T GAAT C A ATTTGGCCAG AT GAGC AT GG 33 6 0 

CCCAGCCCCC TAT T GT AC C C CGGCAAACCC CTCCTCTTCA GCACCATGGA CAGTTGGCTC 3 42 0 

AACCTGGAGC TCTCAACCCG CCTATGGGCT ATGGGCCTCG TAT GCAAC AG CCTTCCAACC 34 8 0 

AGGGCCAGTT CCTTCCTCAG ACTCAGTTCC CAT C AC AGGG AAT GAAT GT A AC AAAT AT C C 354 0 

CTTTGGCTCC GTCCAGCGGT CAAGCT C CAG TGTCTCAAGC AC AAAT GT CT AGTTCTTCCT 3 600 

GCCCGGTGAA CTCTCCTATA ATGCCTCCAG GGTCTCAGGG GAGC CAC AT T CACTGTCCCC 3 660 

AGCTTCCTCA ACCAGCTCTT CATCAGAATT CACCCTCGCC TGTACCTAGT CGTACCCCCA 372 0 

CCCCTCACCA TACTCCCCCA AGCATAGGGG CTCAGCAGCC ACCAGCAACA AC AAT T C CAG 37 8 0 

CCCCTGTTCC TACACCACCA GCCATGCCAC CTGGGCCACA GTCCCAGGCT CTACATCCCC 38 4 0 

CTCCAAGGCA GACACCTACA CCACCAACAA CACAACTTCC CCAACAAGTG CAGCCTTCAC 3 900 

TTCCTGCTGC ACCTTCTGCT GACCAGCCCC AGCAGCAGCC TCGCTCACAG CAGAGCACAG 3 9 60 

CAGCGTCTGT TCCTACCCCA AACGCACCGC TGCTTCCTCC GCAGCCTGCA ACTCCACTTT 4 02 0 

CCCAGCCAGC T GTAAGC ATT GAAGGACAGG TAT C AAAT C C TCCATCTACT AGTAGCACAG 4 08 0 

AAGTGAATTC TCAGGCCATT GCTGAGAAGC AGCCTTCCCA GGAAGT GAAG AT G GAGGC C A 4140 

AAAT GGAAGT GGAT CAACCA GAAC C AGC AG ATACGCAGCC GGAGGAT AT T TCAGAGTCTA 4200 

AAGT GGAAGA CTGTAAAATG GAATCTACCG AAACAGAAGA GAGAAGCACT GAGTTAAAAA 42 60 

CT GAAAT AAA AGAG GAGGAA GACCAGCCAA GTACTTCAGC TACCCAGTCA TCTCCGGCTC 4 32 0 

CAGGACAGTC AAAGAAAAAG ATTTTCAAAC CAGAAGAACT ACGACAGGCA CTGATGCCAA 438 0 

CATTGGAGGC ACTTTACCGT CAGGATCCAG AATCCCTTCC CTTTCGTCAA CCTGTGGACC 4440 

CTCAGCTTTT AGGAATCCCT GATTACTTTG AT ATT GT GAA GAGCCCCATG GATCTTTCTA 4500 

C CAT T AAGAG GAAGTTAGAC ACTGGACAGT AT CAG GAG C C CT GGCAGTAT GT C GAT GAT A 4 5 60 

TTTGGCTTAT GT T CAAT AAT GCCTGGTTAT ATAACCGGAA AAC AT CAC GG GTATACAAAT 4 62 0 

ACTGCTCCAA GCTCTCTGAG GTCTTTGAAC AAGAAATTGA CCCAGTGATG CAAAGCCTTG 4680

GAT AC T GT T G TGGCAGAAAG TTGGAGTTCT CTCCACAGAC ACTGTGTTGC T AC GGCAAAC 47 4 0 

AGTTGTGCAC AATACCTCGT GATGCCACTT ATTACAGTTA CCAGAACAGG TATCATTTCT 4 8 00 

GT GAGAAGT G T T T CAAT GAG AT C C AAGGGG AGAG C GT T T C TTTGGGGGAT GACCCTTCCC 4 8 60 

AGCCTCAAAC TACAATAAAT AAAGAACAAT TTTCCAAGAG AAAAAATGAC AC AC T GGAT C 4 92 0 

CTGAACTGTT T GT T GAAT GT AC AGAGT GC G GAAGAAAGAT GC AT CAGAT C TGTGTCCTTC 498 0 

ACCATGAGAT CATCTGGCCT GCTGGATTCG TCTGTGATGG CTGTTTAAAG AAAAGTGCAC 5040

GAACTAGGAA AGAAAATAAG TTTTCTGCTA AAAGGTTGCC ATCTACCAGA CTTGGCACCT 5100

TT CT AGAGAA T C GT GT GAAT GACTTTCTGA GGCGACAGAA TCACCCTGAG TCAGGAGAGG 5160 




TCACTGTTAG AGT AGTT CAT GCTTCTGACA AAACCGTGGA AGTAAAACCA GGC AT GAAAG 522 0 

CAAGGTTTGT GGACAGT GGA GAGAT GGCAG AATCCTTTCC AT AC C GAAC C AAAGCCCTCT 52 8 0 

TTGCCTTTGA AGAAATTGAT GGTGTTGACC TGTGCTTCTT T GGC AT GC AT GTTCAAGAGT 5 3 40 

ATGGCTCTGA CTGCCCTCCA CCCAACCAGA GGAGAGTATA CATATCTTAC C T C GAT AGT G 5400 

TTCATTTCTT CCGTCCTAAA TGCTTGAGGA CTGCAGTCTA T CAT GAAAT C CTAATT GGAT 54 60 

ATTTAGAATA TGTCAAGAAA TTAGGTTACA CAACAGGGCA TATTTGGGCA TGTCCACCAA 5520 

GTGAGGGAGA TGATTATATC TTCCATTGCC ATCCTCCTGA C C AGAAGAT A CCCAAGCCCA 5580 

AGCGACTGCA GGAAT GGT AC AAAAAAATGC TTGACAAGGC TGTATCAGAG CGTATTGTCC 5 640 

AT GACT AC AA GGATATTTTT AAACAAGCTA CT GAAGAT AG ATTAACAAGT GCAAAGGAAT 57 00 

TGCCTTATTT C GAGGGT GAT TTCTGGCCCA AT GT T CT GGA AGAAAGCATT AAGGAACTGG 57 60 

AACAGGAGGA AGAAGAGAGA AAAC GAGAGG AAAAC AC C AG CAAT GAAAGC ACAGATGTGA 5 82 0 

CCAAGGGAGA CAGCAAAAAT GCTAAAAAGA AGAATAATAA GAAAACCAGC AAAAATAAGA 58 8 0 

GCAGCCTGAG TAGGGGCAAC AAGAAGAAAC CCGGGATGCC CAAT GT AT CT AACGACCTCT 5 9 40 

CACAGAAACT AT AT GC C AC C AT GGAGAAGC ATAAAGAGGT CTTCTTTGTG ATCCGCCTCA 6000 

TTGCTGGCCC TGCTGCCAAC TCCCTGCCTC CCATTGTTGA TCCTGATCCT CTCATCCCCT 6060 

GC GAT CT GAT GGAT GGT CGG GATGCGTTTC TCACGCTGGC AAGGGACAAG CACCTGGAGT 612 0 

TCTCTTCACT CCGAAGAGCC C AGT GGT C C A C CAT GT GC AT GCTGGTGGAG CTGCACACGC 6180 

AGAGCCAGGA CCGCTTTGTC T AC AC CT GC A AT GAAT GCAA GC AC CAT GT G GAGACACGCT 62 40 

GGCACT GTAC T GT CT GT GAG GATTATGACT TGTGTATCAC C T GC T AT AAC ACTAAAAACC 6300 

ATGACCACAA AAT GGAGAAA CTAGGCCTTG GCTTAGATGA TGAGAGCAAC AACCAGCAGG 6360 

CTGCAGCCAC CCAGAGCCCA GGC GATTCTC GCCGCCTGAG TATCCAGCGC T GC AT C C AGT 642 0 

CTCTGGTCCA TGCTTGCCAG T GT C GGAAT G CCAATTGCTC ACTGCCATCC TGCCAGAAGA 64 8 0 

TGAAGCGGGT T GT GCAGCAT ACCAAGGGTT GCAAACGGAA AAC CAAT GGC GGGTGCCCCA 6540 

TCTGCAAGCA GCTCATTGCC CTCTGCTGCT AC CAT G C C AA GC ACT GC C AG GAGAACAAAT 6600 

GCCCGGTGCC GTTCTGCCTA AACATCAAGC AGAAGCTCCG GCAGCAACAG CTGCAGCACC 6660 

GACTACAGCA GGCCCAAATG CTTCGCAGGA GGAT GGC CAG CAT GCAGC GG ACTGGTGTGG 672 0 

TTGGGCAGCA ACAGGGCCTC CCTTCCCCCA CTCCTGCCAC TCCAACGACA CCAACTGGCC 6780

AACAGCCAAC CACCCCGCAG ACGCCCCAGC CCACTTCTCA GCCTCAGCCT ACCCCTCCCA 68 40 

ATAGCATGCC ACCCTACTTG CCCAGGACTC AAGCTGCTGG CCCTGTGTCC CAGGGTAAGG 6 9 00 

CAGCAGGCCA GGTGACCCCT CCAACCCCTC CTCAGACTGC TCAGCCACCC CTTCCAGGGC 6960 

CCCCACCTAC AGCAGTGGAA AT GGC AAT GC AGATTCAGAG AGCAGC GGAG ACGCAGCGCC 7 020 

AGATGGCCCA CGTGCAAATT TTTCAAAGGC CAATCCAACA CCAGATGCCC C C GAT GACT C 7 08 0 

CCATGGCCCC CAT GGGT AT G AACCCACCTC C CAT GAC CAG AGGTCCCAGT GGGCATTTGG 7140 

AGC CAGGGAT GGGAC C GAC A GGGAT GCAGC AACAGCCACC CTGGAGCCAA GGAGGAT T GC 72 00 

CTCAGCCCCA GCAACTACAG TCTGGGATGC CAAGGCCAGC CAT GAT GT C A GTGGCCCAGC 72 60 

AT GGT CAAC C T T T GAAC AT G GCTCCACAAC CAGGATTGGG CCAGGTAGGT ATCAGCCCAC 7 32 0 

T C AAAC CAGG CACTGTGTCT CAACAAGCCT T AC AAAAC CT TTTGCGGACT CTCAGGTCTC 7 38 0 

CCAGCTCTCC CCT GCAGC AG CAACAGGTGC TTAGTATCCT TCACGCCAAC CCCCAGCTGT 7 4 40 

TGGCTGCATT CAT C AAGC AG CGGGCTGCCA AGT AT GC C AA CTCTAATCCA CAACCCATCC 7 500 

CTGGGCAGCC TGGCATGCCC CAGGGGCAGC CAGGGCTACA GCCACCTACC ATGCCAGGTC 7 560 

AGCAGGGGGT CCACTCCAAT CCAGCCATGC AGAACATGAA TCCAATGCAG GCGGGCGTTC 7620

AGAGGGCTGG CCTGCCCCAG CAGCAACCAC AGCAGCAACT CCAGCCACCC ATGGGAGGGA 7680

TGAGCCCCCA GGCTCAGCAG AT GAAC AT GA ACCACAACAC CATGCCTTCA CAATTCCGAG 7 7 40 

ACATCTTGAG AC GAC AGC AA AT GAT G CAAC AGCAGCAGCA AC AGGG AG C A GGGCCAGGAA 7 800 

TAGGCCCTGG AATGGCCAAC CATAACCAGT TCCAGCAACC CCAAGGAGTT GGCTACCCAC 7860

CACAGCCGCA GCAGC GGAT G CAGCATCACA TGCAACAGAT GCAACAAGGA AAT AT GG GAC 7 92 0 

AGATAGGCCA GCTTCCCCAG GCCTTGGGAG CAGAGGCAGG TGCCAGTCTA CAGGCCTATC 7 98 0 

AGCAGCGACT CCTTCAGCAA CAGAT GGGGT CCCCTGTTCA GCCCAACCCC ATGAGCCCCC 8 04 0 

AGCAGCATAT GCTCCCAAAT CAGGCCCAGT CCCCACACCT AC AAGGC CAG CAGATCCCTA 8100 

ATTCTCTCTC CAATCAAGTG CGCTCTCCCC AGCCTGTCCC TTCTCCACGG CCACAGTCCC 8160 

AGCCCCCCCA CTCCAGTCCT TCCCCAAGGA TGCAGCCTCA GCCTTCTCCA CACCACGTTT 822 0 

CCCCACAGAC AAGTTCCCCA CAT C CT GGAC TGGTAGCTGC CCAGGCCAAC CCCATGGAAC 8 2 80 

AAGGGCATTT TGCCAGCCCG GACCAGAATT CAATGCTTTC TCAGCTTGCT AGCAATCCAG 8340

GCATGGCAAA CCTCCATGGT GCAAGCGCCA CGGACCTGGG ACTCAGCACC GATAACTCAG 8400

ACTTGAATTC AAACCTCTCA CAGAGTACAC TAGACATACA CTAGAGACAC CTTGTATTTT 8 4 60 

GGGAGCAAAA AAATTATTTT CTCTTAACAA GACTTTTTGT AC T GAAAAC A ATTTTTTTGA 8 520 

ATCTTTCGTA GC CT AAAAGA CAATTTTCCT TGGAACACAT AAGAACT GT G C AGT AGC C GT 8580 

TTGTGGTTTA AAGCAAACAT GCAAGAT GAA C CT GAGGGAT GATAGAATAC AAAGAATATA 8 64 0 

TTTTTGTTAT GGGCTGGTTA CCACCAGCCT TTCTTCCCCT TTGTGTGTGT GGTTCAAGTG 87 00 

TGCACTGGGA GGAG GCT GAG GCCTGTGAAG C C AAAC AAT A TGCTCCTGCC TTGCACCTCC 87 60 

AATAGGTTTT ATTATTTTTT TTAAATTAAT GAAC AT AT GT AAT AT T AAT G AAC AT AT GT A 8 820 

ATATTAATAG TTATTATTTA CT GGTGCAGA T GGT T GAC AT TTTTCCCTAT TTTCCTCACT 88 8 0 

T T AT GGAAGA GT T AAAAC AT TTCTAAACCA GAGGACAAAA GGGGTT AAT G T TACT TT GAA 894 0 
ATTACATTCT ATATATATAT AAATATATAT AAATATATAT TAAAATACCA GTTTTTTTTC 9000

TCTGGGTGCA AAGATGTTCA TTCTTTTAAA AAATGTTTAA AAAAAA 9046

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7326 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

AT GGC C GAGA ACTTGCTGGA CGGACCGCCC AACCCCAAAC GAGCCAAACT CAGCTCGCCC 60 
GGCTTCTCCG C GAAT GACAA CACAGATTTT G GAT C ATT GT TTGACTTGGA AAATGACCTT 12 0 

C CT GAT GAG C TGATCCCCAA T GGAGAAT T A AGCCTTTTAA ACAGTGGGAA CCTTGTTCCA 180 
GATGCTGCGT CCAAACATAA ACAACTGTCA GAGCTTCTTA GAGGAGGCAG CGGCTCTAGC 2 40 

AT CAAC C C AG GGATAGGCAA TGTGAGTGCC AGCAGCCCTG TGCAACAGGG CCTTGGTGGC 300 
CAGGCTCAGG GGCAGCCGAA CAGTACAAAC ATGGCCAGCT TAGGTGCCAT GGGCAAGAGC 3 60 

CCTCTGAACC AAGGAGACTC ATCAACACCC AACCTGCCCA AACAGGCAGC CAGCACCTCT 42 0 

GGGCCCACTC CCCCTGCCTC CCAAGCACTG AATCCACAAG CACAAAAGCA AGTAGGGCTG 480 
GTGACCAGTA GTCCTGCCAC AT C AC AGAC T GGACCTGGGA TCTGCATGAA TGCTAACTTC 5 40 

AACCAGACCC ACCCAGGCCT TCTCAATAGT AACTCTGGCC ATAGCTTAAT GAATCAGGCT 600 
CAACAAGGGC AAGCTCAAGT CAT GAAT GGA TCTCTTGGGG CTGCTGGAAG AGGAAGGGGA 660 
GCTGGAATGC CCTACCCTGC TCCAGCCATG CAGGGGGC C A CAAGCAGTGT GCTGGCGGAG 72 0 

ACCTTGACAC AGGTTTCCCC ACAAATGGCT GGCCATGCTG GACTAAATAC AGCACAGGCA 780 
GGAGGC AT GA CCAAGATGGG AATGACTGGT ACCACAAGTC CATTTGGACA ACCCTTTAGT 8 40 

CAAACTGGAG GGCAGCAGAT GGGAGCCACT GGAGTGAACC CCCAGTTAGC CAGCAAACAG 9 00 

AGCATGGTCA ATAGTTTACC TGCTTTTCCT ACAGATATCA AGAATACTTC AGTCACCACT 960

GT GCCAAATA TGTCCCAGTT GCAAAC AT C A GTGGGAATTG TACCCACACA AGCAATTGCA 1020 

ACAGGCCCCA CAGCAGACCC T GAAAAAC GC AAACT GAT AC AGCAGCAGCT GGTTCTACTG 1080 

CTTCATGCCC ACAAATGTCA GAGAC GAGAG CAAGCAAATG GAGAGGTTCG NGCCTGTTCT 1140 

CTCCCACACT GT C GAAC CAT GAAAAAC GT T T T GAAT C AC A TGACACATTG TCAGGCTCCC 12 00 

AAAGCCTGCC AAGTTGCCCA TTGTGCATCT T C AC GAC AAA T CAT CT CT C A TTGGAAGAAC 12 60 

TGCACACGAC ATGACTGTCC TGTTTGCCTC CCTTTGAAAA ATGCCAGTGA CAAGCGAAAC 132 0 

CAACAAACCA TCCTGGGATC TCCAGCTAGT GGAATTCAAA ACACAATTGG TTCTGTTGGT 1380 

GCAGGGCAAC AGAATGCCAC TTCCTTAAGT AACCCAAATC CCATAGACCC CAGTTCCATG 1440

CAGCGGGCCT ATGCTGCTCT AGGACTCCCC T AC AT GAAC C AGCCTCAGAC GCAGCTGCAG 15 00 

CCTCAGGTTC CTGGCCAGCA AC C AGCACAG CCTCCAGCCC ACCAGCAGAT GAGGACTCTC 15 60 

AATGCCCTAG GAAACAACCC CAT GAGT GT C CCAGCAGGAG GAATAACAAC AGAT CAACAG 1620 

CCACCAAACT TGATTTCAGA ATCAGCTCTT CCAACTTCCT TGGGGGCTAC CAATCCACTG 168 0 

AT GAAT GAT G GTTCAAACTC TGGTAACATT GGAAGCCTCA GC AC GAT AC C TACAGCAGCG 1740 

CCTCCTTCCA GCACTGGTGT TCGAAAAGGC T GGCAT GAAC AT GT GAC T C A GGACCTACGG 18 0 0 

AGT CAT CT AG TCCATAAACT CGTTCAAGCC ATCTTCCCAA CTCCAGACCC TGCAGCTCTG 18 60 

AAAGATCGCC GCAT GGAGAA CCTGGTTGCC TATGCTAAGA AAGT GGAGGG AGAC AT GT AT 1920 

GAGTCTGCTA ATAGCAGGGA T GAAT ACT AT CATTTATTAG CAGAGAAAAT CTATAAAATA 198 0 

CAAAAAGAAC TAGAAGAAAA GCGGAGGACA CGTTTACATA AGCAAGGCAT CCTGGGTAAC 2 04 0 

CAGCCAGCTT TACCAGCTTC TGGGGCTCAG CCCCCTGTGA TTCCACCAGC CCAGTCTGTA 2100 

AGAC CT C C AA ATGGGCCCCT GCCTTTGCCA GT GAAT C G C A TGCAGGTTTC TCAAGGGATG 2160 

AAT T CAT T T A ACCCAATGTC CCTGGGAAAC GTCCAGTTGC CACAGGCACC CAT GGGAC CT 222 0 

CGTGCAGCCT CCCCTATGAA CCACTCTGTG CAGATGAACA GCAT GGC CTC AGTTCCGGGT 22 8 0 

AT GGC C ATT T CTCCTTCACG GAT GC CTC AG CCTCCAAATA T GAT GGGCAC T CAT GC CAAC 2 340 

AACATTATGG CCCAGGCACC TACTCAGAAC CAGTTTCTGC CACAGAACCA GTTTCCATCA 2 4 00 

TCCAGTGGGG C AAT GAGT GT GAAC AGT GT G GGCATGGGGC AAC CAGCAGC CCAGGCAGGT 2 4 60 

GT T T C AC AG G GTCAGGAACC TGGAGCTGCT CTCCCTAACC C T C T GAAC AT GCTGGCACCC 2520 

CAGGCCAGCC AGCTGCCTTG CCCACCAGTG ACACAGTCAC CATTGCACCC GACTCCACCT 2 58 0 

CCTGCTTCCA CAGCTGCTGG CATGCCCTCT CTCCAACATC CAACGGCACC AGGAAT GAC C 2640 

CCTCCTCAGC CAGCAGCTCC CACTCAGCCA TCTACTCCTG TGTCATCTGG GCAGACT C CT 27 00 

ACCCCAACTC CTGGCTCAGT GCCCAGCGCT GCCCAAACAC AGAGTACCCC T AC AGT C CAG 2 7 60 

G CAGCAGC AC AGGCTCAGGT GAC T C C AC AG CCTCAGACCC CAGTGCAGCC ACCATCTGTG 2820 

GCTACTCCTC AGT CAT CAC A G CAG CAAC C A ACGCCTGTGC ATACTCAGCC AC C T GGC AC A 28 80 

CCGCTTTCTC AGGCAGCAGC CAGCATTGAT AATAGAGTCC CTACTCCCTC CAC T GT GAC C 2940 
AGTGCTGAAA CCAGTTCCCA GCAGCCAGGA CCCGATGTGC CCATGCTGGA AATGAAGACA 3000

GAGGTGCAGA CAGATGATGC TGAGCCTGAA CCTACTGAAT CCAAGGGGGA ACCTCGGTCT 3060 

GAGATGATGG AAGAGGATTT AC AAGGTT CT TCCCAAGTAA AAGAAGAGAC AGATACGACA 3120 

GAGCAGAAGT CAGAGCCAAT GGAAGTAGAA GAAAAGAAAC C T GAAGT AAA AGT GGAAGCT 3180 

AAAGAGGAAG AAGAGAACAG TTCGAACGAC ACAGCCTCAC AATCAACATC TCCTTCCCAG 3240

CCACGCAAAA AAATCTTTAA ACCCGAGGAG CTACGCCAGG CACTTATGCC AACTCTAGAA 3300

GCACTCTATC GACAGGACCC AGAGTCTTTG CCTTTTCGTC AGCCTGTAGA TCCTCAGCTC 33 60 

CTAGGAATCC CAGATTATTT T GAT AT AGT G AAGAATCCTA TGGACCTTTC T AC CAT C AAA 342 0 

CGAAAGCTGG ACACAGGGCA AT AT C AAGAA CCCTGGCAGT ATGTGGATGA TGTCAGGCTT 348 0 

AT GTT CAACA ATGCGTGGCT AT AT AAT C GT AAAACGTCCC GTGTATATAA ATTTTGCAGT 354 0 

AAACTTGCAG AGGTCTTTGA ACAAGAAATT GACCCTGTCA TGCAGTCTCT TGGATATTGC 3600 

TGTGGACGAA AGT AT GAGT T CTCCCCACAG ACTTTGTGCT GTTACGGAAA GCAGCTGTGT 3660 

ACAATTCCTC GTGATGCAGC CTACTACAGC TATCAGAATA GGTATCATTT CTGTGGGAAG 3720 

T GTTT CACAG AGATCCAGGG CGAGAATGTG ACCCTGGGTG ACGACCCTTC CCAACCTCAG 37 8 0 

ACGACAATTT CCAAGGATCA AT T T GAAAAG AAGAAAAATG ATACCTTAGA TCCTGAACCT 38 4 0 

TTTGTTGACT GCAAAGAGTG TGGCCGGAAG ATGCATCAGA TTTGTGTTCT ACACT AT GAC 3900 

ATCATTTGGC CTTCAGGTTT T GT GT GT GAC AACTGTTTGA AGAAAACTGG CAGACCTCGG 3960 

AAAGAAAACA AATTCAGTGC TAAGAGGCTG CAGACCACAC GAT T GGGAAA CCACTTAGAA 4020 

GACAGAGTGA ATAAGTTTTT GCGGCGCCAG AATCACCCTG AAGCTGGGGA GGTTTTTGTC 408 0 

AGAGTGGTGG CCAGCTCAGA CAAGACTGTG GAGGTCAAGC CGGGAATGAA GTCAAGGTTT 4140

GTGGATTCTG GAGAGATGTC GGAATCTTTC CCATATCGTA CCAAAGCACT CTTTGCTTTT 4200 

GAGGAGATCG ATGGAGTCGA TGTGTGCTTT TTTGGGATGC ATGTGCAAGA TACGGCTCTG 42 60 

ATTGCCCCCC ACCAAATACA AGGCT GT GT A T AC AT AT C T T ATCTGGACAG TATTCATTTC 432 0 

TTCCGGCCCC GCTGCCTCCG GAC AG C T GT T TACCATGAGA TCCTCATCGG ATATCTCGAG 4 380 

TAT GT GAAGA AATTGGTGTA TGTGACAGCA CATATTTGGG CCTGTCCCCC AAGT GAAGGA 4440 

GATGACTATA TCTTTCATTG CCACCCCCCT GAC C AGAAAA TCCCCAAACC AAAACGACTA 4 500 

CAGGAGTGGT ACAAGAAGAT GCTGGACAAG GCGTTTGCAG AGAGGATCAT T AAC GAC TAT 4560 

AAGGAC AT CT TCAAACAAGC GAAC GAAGAC AGGCT C AC GA GTGCCAAGGA GTTGCCCTAT 462 0 

TTT GAAGGAG ATTTCTGGCC T AAT GT GT T G GAAGAAAGCA TTAAGGAACT AGAACAAGAA 468 0 

GAAGAAGAAA GGAAAAAAGA AGAGAGTACT GCAGCGAGTG AGACTCCTGA GGGCAGT C AG 47 40 

GGTGACAGCA AAAATGCGAA GAAAAAGAAC AACAAGAAGA CCAACAAAAA CAAAAGCAGC 4 800 

ATTAGCCGCG C C AAC AAGAA GAAGCCCAGC AT GC C C AAT G TTTCCAACGA CCTGTCGCAG 4 8 60 

AAGCTGTATG C C AC CAT GGA GAAGCACAAG GAGGT ATT C T TTGTGATTCA TCTGCATGCT 4 92 0 

GGGCCTGTTA TCAGCACTCA GCCCCCCATC GTGGACCCTG ATCCTCTGCT TAGCTGTGAC 4980 

CT CAT GGAT G GGCGAGATGC CTTCCTCACC CTGGCCAGAG ACAAGCACTG GGAATTCTCT 5040 

TCCTTACGCC GCTCCAAATG GTCCACTCTG TGCATGCTGG TGGAGCTGCA CACACAGGGC 5100 

CAGGACCGCT TT GTTT AT AC C T GC AAT GAG TGCAAACACC AT GT GGAAAC ACGCTGGCAC 5160 

T GC ACT GT GT GT GAGGACT A TGACCTTTGT ATCAATTGCT ACAACACAAA GAGCCACACC 522 0 

CATAAGATGG TGAAGTGGGG GCTAGGCCTA GATGATGAGG GCAGCAGTCA GGGTGAGCCA 52 8 0 

CAGTCCAAGA GCCCCCAGGA ATCCCGGCGT CTCAGCATCC AGCGCTGCAT CCAGTCCCTG 5340 

GTGCATGCCT GCCAGTGTCG C AAT G C C AAC TGCTCACTGC CGTCTTGCCA GAAGAT GAAG 54 00 

C GAGT C GT GC AGCACACCAA GGGCT GCAAG CGCAAGACTA AT G GAGGAT G CCCAGTGTGC 5460 

AAGCAGCTCA TTGCTCTTTG CTGCTACCAC GCCAAACACT GC C AAGAAAA TAAATGCCCT 5520 

GTGCCCTTCT GCCTCAACAT CAAACATAAC GTCCGCCAGC AG C AGAT CCA GCACTGCCTG 5580 

CAGCAGGCTC AGCTCATGCG CCGGCGAATG G C AAC CAT GA ACACCCGCAA TGTGCCTCAG 5640 

CAGAGTTTGC CTTCTCCTAC CTCAGCACCA CCCGGGACTC CTACACAGCA GCCCAGCACA 57 00 

CCCCAAACAC CACAGCCCCC AGCCCAGCCT CAGCCTTCAC CT GT T AAC AT GTCACCAGCA 57 60 

GGCTTCCCTA ATGTAGCCCG GACTCAGCCC C C AAC AAT AG TGTCTGCTGG GAAGCCTACC 582 0 

AACCAGGTGC CAGCTCCCCC ACCCCCTGCC CAGCCCCCAC CTGCAGCAGT AG AAGC AG C C 588 0 

CGGCAAATTG AAC GT GAGGC CCAGCAGCAG CAGCACCTAT AC C GAGCAAA CAT C AAC AAT 5940 

GGCATGCCCC CAGGACGTGA CGGTATGGGG ACCCCAGGAA GCCAAATGAC TCCTGTGGGC 6000 

CTGAATGTGC CCCGTCCCAA C C AAGT C AGT GGGCCTGTCA TGTCTAGTAT GCCACCTGGG 6060 

CAGTGGCAGC AGGCACCCAT CCCTCAGCAG CAGCCGATGC CAGGCATGCC CAGGCCTGTA 6120 

ATGTCCATGC AGGCCCAGGC AGCAGT GGCT GGGCCACGGA TGCCCAATGT GCAGCCAAAC 618 0 

AGGAGC AT CT CGCCAAGTGC CCTGCAAGAC CTGCTACGGA CCCTAAAGTC ACCCAGCTCT 62 40 

CCTCAGCAGC AGCAGCAGGT GCTGAACATC CTTAAATCAA ACCCACAGCT AAT GGCAGCT 6300 

TTCATCAAAC AGCGCACAGC CAAGT AT GT G GCCAATCAGC CTGGCATGCA GCCCCAGCCC 6360 

GGACTTCAAT CCCAGCCTGG TATGCAGCCC CAGCCTGGCA TGCACCAGCA GCCTAGTTTG 642 0 

CAAAACCTGA AC GCAAT GC A AGCTGGTGTG CCACGGCCTG GTGTGCCTCC ACCACAACCA 64 80 

GCAAT GG GAG GCCTGAATCC CCAGGGACAA GCTCTGAACA T CAT GAAC CC AGGACACAAC 6540 

CCCAACATGA CAAACATGAA T C C AC AGT AC CGAGAAATGG TGAGGAGACA GCT GCT AC AG 6600 

CACCAGCAGC AGCAGCAGCA ACAGCAGCAG CAGCAGCAGC AACAACAAAA TAGTGCCAGC 6660 

TTGGCCGGGG GCATGGCGGG ACACAGC C AG TTCCAGCAGC CAC AAGGAC C T GGAGGT TAT 672 0 
GCCCCAGCCA TGCAGCAGCA ACGCATGCAA CAGCACCTCC CCATCCAGGG CAGCTCCATG 6780

GGCCAGATGG CTGCTCCAAT GGGACAACTT GGCCAGATGG GGCAGCCTGG GCTAGGGGCA 68 4 0 

GACAGCACCC CT AAT AT C C A GCAGGCCCTG CAGCAACGGA TTCTGCAGCA GCAGCAGATG 690 0 

AAGCAACAAA TTGGGTCACC AGGCCAGCCG AACCCCATGA GCCCCCAGCA GCACATGCTC 6960 

TCAGGACAGC CACAGGCCTC ACATCTCCCT GGCCAGCAGA TCGCCACATC CCTTAGTAAC 7 02 0 

CAGGT GCGAT CTCCAGCCCC TGTGCAGTCT CCACGGCCCC AATCCCAACC TCCACATTCC 7 08 0 

AGCCCGTCAC C AC GGAT AC A ACCCCAGCCT TCACCACACC ATGTTTCACC CCAGACTGGA 714 0 

ACCCCTCACC CTGGACTCGC AGT C AC CAT G GCCAGCTCCA TGGATCAGGG ACACCT GGGG 72 00 

AACCCTGAAC AGAGT GCAAT GCTCCCCCAG CTGAATACCC CCAACAGGAG CGCACTGTCC 7 2 60 

AGTGAACTGT CCCTGGTTGG TGATACCACG GGAGACACAC TAGAAAAGTT TGTGGAGGGT 7320

TTGTAG 7326

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2499 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TCACTTGTCA ATT AAT C C AG CTTCCTTAAT TTTACTGAAG AAGAATTTCT CCAGGATATT 60 

GGCACATTTG TAGTATTCAC TCTCAGGGGC GTTGTACTCT TTGCAATTGG TAAAGACTCG 12 0 

CTGTAAGTCT GC CAT GAATA ATTTCTTAGA CACGTAGTAC CTATTCTTGA GGCGTTCACT 18 0 

CATGGTTTTG AGAT C CAT GG GGGAC CTT AT AACTTCATAA TATCCTGGAG CTTCTGTTCT 240 

CTTCACAGGT TCCATGAAGG GCCAAGCGCT TTGATGGCTC TTCACCTGCT GGAGGATGCT 300

CTTGAGCGTG CTGTAAAGCT GGTCAGGGTC TCTGGGCTCT TTACTTTTCT CTTTTCCACT 360 

CGGTTTCCAG CCTGTCTCTC TAATTCCAGG AATGCTTTCT ATAGGAATCT GTCGAACTCC 420

ATCTTTAAAA CAT GAAAGT C CAGGGTAAAC TTTTCGAATT TGTGCCTGTT TTCTTTCAAT 48 0 

CAGTTTTTTA ATTATCTCCT TCTGCTTTTT AAT GAT GAC A GAAAATTCTG TGTACGGGAT 540 

CCGTGGATTT AGCTCACATC C CAT T AAAGT GGCTCCTTCA TAATCCTTGA TATAGCCAAC 60 0 

AT AT T T GGTT TTAGGTATTT TAATTTCTTT GGAGAAACCC TGTTTCTTAA AGT AT C C AAT 660 

TGCATATTCA T C T GC AT AT G TGAGGAAGTT CAGGATGTCA TGCTTTATGT GAT AT T CT T T 72 0 

CAAATGATTC AT CAGGT GT G TTCCATAGCC CTTGACTTGC T CAT T T GAGG TTACAGCACA 780 

GAAGAC AAT C TCTGTGAATC CTT GAGAT GG GAACATACGG AAACAGATAC CACCAATAAC 840 

ACGGCCATCT TTAATTAAAG CAAGGGTTTT GTGTTTCGGG TCAAAGACGA GCCGTGTGAT 900 

GTATTCTTTT GGCATTCGGG GCAGCTGGTG GGAGAAAACG TTCTGTAGGC CAACCAGCCA 960 

CAT CAGGAT C TTCTTGTTTG GTTTCTGGTT GAGGGAATTG CCAACCACGT GAAATTCAAT 102 0 

TACACCCCTG CGCTCTTCCA ACCTTGCCGC CTCATCCCTG GCCGAGTGTG CTGACAGAAA 108 0 

ATTGGTCTCT GGTCCAAGCA TTGCTGCAGG GTCCGTGATG GTAGACATAA CCTCGTTGAT 114 0 

TAATT C CAT C GGAATATCCC CCATAACTCG GGGTTTCTTG GCCTCCTCCA GAAC AT GAGA 12 0 0 

AT C AGT C ATT TTCCTCTTTT CTCCTGGGTT TGCCTCAAGT CCAGAAGAGG CTTTGCAGGC 12 60 

AGGACTGCTG CTCCCTGCGT TTGGCTGCTC AAGGGAAGAT GAGGTTGAAT T GT AT GAAAT 132 0 

TGTCCCAGCC ACAGGAGGTG GATTGATAAC TGTTTGGATG CCTAGCTGGC TGGTTCTGGA 138 0 

AGAGGCTGAG AGAAAAT C CT GATCCCAGAT GGGAGAGTTT T GACT AT ATA CTTCTTCTTC 14 4 0 

TAGCATGGAC AGAAATTTTG GGAAAT GAGT GAGGATTAGA GTTCGTTTTT CAAGAGGCAG 150 0 

TTTATCTTTT TCCTGTCTTG CTTGTTCCAG GAGTTGTCGC C T C AT AAC AG T GAAGAC C GA 15 60 

GCGAAGCAAT GTTCTCCCAA ACACCTGTGT GGTTTCGTAC CGAGGTAGAC TGTCGCAGAA 1620

CTGTGGCACG TTGCAGTAAC ACAGCCACCT TGTGTAGTTC TCTTTGTATC C AGAAAT AT C 168 0 

AT CAT T GGGA GAT C GC AGT C TTCGTTGAGA TGGTGCCTCC AGAT G C C AAT AGTTGATGCG 1740 

GTTTAGGAAC ATTTTTGCCA ACTCAACTAT TGTTTGCCTT TCTTTTGCTG G CAGGT GACT 1800 

AAATTTGTAC T GC AC AAAGT TATTCACACC CTGTTCAATG CTAGGTTTTT CAAAT GGGGG 18 60 

TTTCTTTTCC AAAGAGCCTT CAACCACAGG TTTTCCTCTT T GT AAAAT AG ACTTTCTCAA 192 0 

GAGCTTAAAT AGAT AGAAAT AAACTTGTTT GGTATCTGCA TCTTCTTCCT T GT GGACAC A 198 0 

GGTAAAGAGA TATTCCACAT CC AAT ACT AT TCCCAGGAGT CTGTTCATTT CTTCCTCTGA 2 04 0 

CACATTCTCC AGGTGGGAAA CAT GAGC AGC T AGGGC AT GG CTACAACTCC GACAGGATTC 2100 

TGTTAGACTG ACAATTATTT GCT GCAGGT C GGCTCTGGGG GGAGTGGGTG AGGGGTTAGG 2160 

GTTTTTCCAG C CAT T ACATT TACAAGACTC CTCGGCCTTG CAGGCGGAGT ACACTCCGAG 222 0 

TTTCTCCAGT TTCTTGGCCC GCGGAGCGGA GCGTAGTTGC GCTTTCTTCA CGGCGATTCG 22 8 0 

GGCCGAGCCA CCGCCTCCCG GTCCTTCGGC CGTGCCCGCT GCAGCCACTG CCGTCGCCGG 2 340 

ACCGCAGGCG CCCGAGCCCC CGGCGGCAGC GGCGCAGGGG GAGCCCTGCG GGGGCGCGGG 2 4 00 
CGGAAGCGCC GCAGGCTGCG GGGGCAGCGC CCCGGGCCCG GCCCCTGCCC CGGCTCCTGC 2460

CCCGCAGCCG CCCGGCCCGG CCCCGCCAGC CTCGGACAT 2499

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2442 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

T C ACT T GT C A ATCAACCCTG CTTCCTTAAT TTTACTGAAG AAGAACTTCT CCAGGATGCT 60 

GGCGCATTTG TAGTACTCGC TCTCGGGAGG GTTGTACTCC TTGCAGTTGG TGAACACTCG 12 0 

TTGCAAGTCC G C CAT GAAT A ACTTCTTAGA C AC AT AGT AC CTGTTCCTGA GGCGTTCACT 18 0 

CATGGTTTTC AGAT C CAT GG GGAACCTTAT AACTTCATAA TATCCCGGAG CTTCTGTTCT 2 40 

CTTCACTGGT T C CAT GAAAG GCCAAGCATT TGGATGGTTC TTCACCTGCT G C AG GAT GT T 300 

CTTGAGGGTG CTGTAAACGT GCTCAGGGTC TTTGGGCTCT TTACTTTTCT CTTTTCCACT 3 60 

TGGTTTCCAG CCTGTCTCTC TGATTCCAGG AATGCTTTCT ATAGGAATCT GCCGAACTCC 42 0 

ATCTTTGAAA C AC GAAAGT C CAGGGTAGAC TTTTCGAATC TGGGCTTGTT TTCTTTCTAT 48 0 

CAGCTTTTTA ATGATCTCCT TCTGCTTTTT AATGATGACA GAGAACTCTG TGTATGGGAT 54 0 

CTGAGGGTTC AGCTCACATC C CAT C AAAGT GGCCCCTTCA TAATCCTTGA TGTAGCCAAC 600 

ATATTTGGTT TTAGGTATTT TGATTTCTTT GGAGAAACCC TGCTTCTTGA AATAGCCGAT 660 

GGCAT ACT C A TCTGCATATG TGAGGAAGTT GAGGATCTCG TGCTTTATGT GGTATTCTTT 72 0 

GAGATGGTTC ATCAGGTGGG TTCCATAGCC CTTGACTTGT TCATTTGAGG TTACTGCACA 7 80 

GAAAACAATC T CT GT GAAT C CCTGGGAT GG AAACATCCGG AAACAGATAC CACCAATGAC 84 0 

ACGGCCATCT TTAATTAAAG CAAGGGTTTT GTGTTTCGGG TCAAAGACGA GCCGTGTGAT 900 

GTACTCTTTG GGCAT TCTGG GCAGCT GGT G GGAAAACACA TTCTGGAGGC CCACGAGCCA 960 

CAT C AGGAT C TTCTTGTTTG GTTTCTGGTT CAGGGAGTTG CCCACCACGT GGAATTCAAT 102 0 

GACACCCCTG CGTTCTTCCA GCCGTGCCGC CTCATCTCTG GCCGAATGGG CTGACAGAAA 1080 

ATTGGTCTCT GGTCCAAGCA TCCCTGCAGG GTCTGTGATG GTAGACATGA CCTCATTGAT 114 0 

CAATTCCACG GGAATATCCC CCATCACTCG AGATCTCTTG GCCTCCTCGG GAGCATGAGA 12 00 

GTTGTTCATT TTCCTCTTTT CTCCCGGGTT TGCTTCAAGC CCAGAAGAGC CTCTGCATCC 12 60 

AGGACTTGTT CTCCCTCCAT TGATCTGCTC ATGGGAAGTT GAATTTGAAC TGAACAATGC 132 0 

T GT C C C AGT A ACAGGAGGAC T GAT TACT GT TTGGATTCCT AGCGGGCTGG TTCTGGAAGA 138 0 

GGCT GAGAGA AAATCCTGAT CCCAGATAGG AGAATTTTGA CTATACACTT CTTCTTCCAA 144 0 

CATGGACAGA AACTTTGGGA AATGTGTGAG GATAAGC GT G CGTTTCTCAA GAGGCAGTTT 1500 

GTCTTTTTTC TGTCTGGCTT GTTCCAAGAG CTGTCGTCTC AT GAT G GT GA AGACCGAGCG 1560 

AAGCAATGTT CTCCCAAACA CCTTTGTGGT TTCGTACCGA GGTAAGCT GT C AC AGAAC T G 162 0 

C GGT AC AT T G CAGTAGCACA ACCACCTTGT GTAGTTTTCC TTGTATCCAG AGAT GT CAT C 168 0 

ATTGGGAGAC CGTAGTCTCC GCTGAGATGG AGCCTCCAGA T GC C AGT AGT T GAT GC GGT T 17 4 0 

C AGAAAC AT C TTGGCCAGCT CGATCGTTGT CTGCCTCTCT TTCGATGGCA AGT GACT AAA 1800 

CTTGTACTGC AC GAAGTT GT TCACACCCTG TTCAATACTG GGCTTCTCAA ATGGCGGCTT 18 60 

CTTCTCCAAG GAGCCTTCAA C C AC AG GT TT TCCTCTTTGT AAAATTGACT TTCTCAAGAG 192 0 

CT T GAAT AGG TAGAAGTACA CTTGTTTGGT ATCTGCATCT TCTTCTTTGT GGACGCAGGT 198 0 

GAAGAGGT A C TCCACATCCA ACACAATTCC CAGGAGTCTG TCCATCTCTT CCTCTGACAC 2 04 0 

ATTCTCCAAG TGAGAAACGT GAGCAGCAAG GGCATGGCTA CAGCTTCGAC AGGATTCTGT 2100

CAAACTGACA ATTATCTGCT GGAGGTCTCC TCTTGGTGGA GTAGGAGAGG GGTTAGGGTT 2160

CTTCCAGCCA TTGCATTTAC AGGACTCCTC TGCCTTGCAG GCGGAGTACA CGCCGAGTTT 222 0 

CTCCAGCTTC TTCGCCCGCG GAGCAGAGCG CAACTGCGCC TTCTTCACGG CGATCCGGGC 228 0 

CGAGCCGCCT CCTCCCGGTC CCTCGGCGGT GCCCGCCGCG GCCACCGGCG TCGCTGGCCC 2 34 0 

GCAGGAAGCA GAGCTCCCGG CAGCGGTGGC CAGGGTCCGG GGGGAACCGT GCGGGGGCGC 2 400 

GGGAGGCAGT GCTGGGGACC CGGCCCCGCC AGCCTCGGCC AT 2 442 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CCCGCCAGCC TCGGACATGC 20

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CCCGCCAGCC TCGGCCATGC 20

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2442 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 



ATGGCCGAGG CTGGCGGGGC CGGGTCCCCA GCACTGCCTC CCGCGCCCCC GCACGGTTCC 6 0 

CCCCGGACCC TGGCCACCGC TGCCGGGAGC TCTGCTTCCT GCGGGCCAGC GACGCCGGTG 12 0 

GCCGCGGCGG GCACCGCCGA GGGACCGGGA GGAGGCGGCT CGGCCCGGAT CGCCGTGAAG 18 0 

AAGGCGCAGT TGCGCTCTGC TCCGCGGGCG AAGAAGCTGG AGAAACT C GG CGTGTACTCC 24 0 

GCCTGCAAGG CAGAGGAGTC CTGTAAATGC AATGGCTGGA AGAACCCTAA CCCCTCTCCT 3 00 

ACTCCACCAA GAGGAGACCT CCAGCAGATA ATTGTCAGTT TGACAGAATC CTGTCGAAGC 3 60 

TGTAGCCATG CCCTTGCTGC TCACGTTTCT CACTTGGAGA ATGTGTCAGA GGAAGAGATG 42 0 

GACAGACTCC TGGGAATTGT GTTGGATGTG GAGTACCTCT TCACCTGCGT CCACAAAGAA 480

GAAGAT G C AG ATACCAAACA AGTGTACTTC T AC C TAT T C A AGCTCTTGAG AAAGTCAATT 54 0 

TTACAAAGAG GAAAACCTGT GGTTGAAGGC TCCTTGGAGA AGAAGCCGCC ATTTGAGAAG 600 

CCCAGTATTG AACAGGGTGT GAACAACTTC GTGCAGTACA AGTTTAGTCA CTTGCCATCG 6 60 

AAAGAGAGGC AGACAACGAT CGAGCTGGCC AAG AT GT T T C T GAAC C G CAT CAACTACTGG 72 0 

CATCTGGAGG CTCCATCTCA GCGGAGACTA CGGTCTCCCA ATGATGACAT CTCTGGATAC 780

AAGGAAAACT AC AC AAG GT G GTTGTGCTAC TGCAATGTAC CGCAGTTCTG TGACAGCTTA 840 

CCTCGGTACG AAAC CAC AAA GGTGTTTGGG AGAACATTGC TTCGCTCGGT CTTCACCATC 90 0 

AT GAGAC GAC AGCTCTTGGA ACAAGCCAGA CAGAAAAAAG ACAAACTGCC TCTTGAGAAA 9 60 

CGCACGCTTA TCCTCACACA TTTCCCAAAG TTTCTGTCCA TGTTGGAAGA AGAAGT GT AT 102 0 

AGTCAAAATT CTCCTATCTG GGAT CAGGAT TTTCTCTCAG CCTCTTCCAG AACCAGCCCG 108 0 

CTAGGAATCC AAACAGTAAT CAGTCCTCCT GTTACTGGGA CAGCATTGTT CAGTTCAAAT 1140

TCAACTTCCC AT GAGC AGAT CAAT GGAGGG AGAACAAGTC C T G GAT GC AG AGGCTCTTCT 12 0 0 

GGGCTT GAAG CAAACCCGGG AGAAAAGAGG AAAAT GAAC A AC T C T CAT GC TCCCGAGGAG 12 60 

GC C AAGAGAT CTCGAGTGAT GGGGGATATT CCCGTGGAAT T GAT CAAT GA GGTCATGTCT 132 0 

AC CAT CAC AG ACCCTGCAGG GATGCTTGGA CCAGAGACCA ATTTTCTGTC AGCCCATTCG 13 8 0 

GCCAGAGATG AGGCGGCACG GCTGGAAGAA CGCAGGGGTG TCATTGAATT CCACGTGGTG 1440

GGCAACTCCC TGAACCAGAA ACCAAACAAG AAGATCCTGA TGTGGCTCGT GGGCCTCCAG 1500

AATGTGTTTT CCCACCAGCT GCCCAGAATG CCCAAAGAGT ACATCACACG GCTCGTCTTT 1560

GACCCGAAAC ACAAAACCCT TGCTTTAATT AAAGATGGCC GTGTCATTGG TGGTATCTGT 1620

TT C C GGAT GT TTCCATCCCA GGGATTCACA GAGATTGTTT TCTGTGCAGT AAC C T C AAAT 168 0 

GAACAAGTCA AGGGCTATGG AACCCACCTG AT GAAC CAT C T C AAAGAAT A C C AC AT AAAG 17 4 0 

CACGAGATCC TCAACTTCCT CACATATGCA GATGAGTATG CCATCGGCTA TTTCAAGAAG 1800

CAGGGTTTCT C C AAAG AAAT C AAAAT AC CT AAAACCAAAT ATGTTGGCTA CATCAAGGAT 18 60 

TATGAAGGGG CCACTTTGAT GGGAT GT GAG CTGAACCCTC AGATCCCATA CACAGAGTTC 192 0 

TCTGTCATCA T T AAAAAGC A GAAGGAGATC ATTAAAAAGC T GAT AGAAAG AAAACAAGCC 19 8 0 

C AGAT T C G AA AAGTCTACCC TGGACTTTCG TGTTTCAAAG ATGGAGTTCG GCAGATTCCT 2 04 0 
ATAGAAAGCA TTCCTGGAAT CAGAGAGACA GGCTGGAAAC CAAGTGGAAA AGAGAAAAGT 2100

AAAGAGC C C A AAGACCCTGA GC AC GT T T AC AGCACCCTCA AGAACATCCT GCAGCAGGTG 2160 

AAGAACCATC CAAATGCTTG GCCTTTCATG GAACCAGTGA AGAGAACAGA AGCTCCGGGA 222 0 

TAT TAT GAAG TTATAAGGTT C C C CAT GGAT CTGAAAACCA TGAGTGAACG CCTCAGGAAC 228 0 

AGGTACTATG TGTCTAAGAA GT TAT T CAT G GCGGACTTGC AACGAGTGTT CACCAACTGC 2 340 

AAGGAGTACA ACCCTCCCGA GAGC GAGT AC TACAAATGCG CCAGCATCCT GGAGAAGT T C 2 400 

TTCTTCAGTA AAATTAAGGA AGCAGGGTTG ATTGACAAGT GA 2 442 
What is claimed is:

1. A purified protein designated P/CAF having a molecular weight of about 93,000
daltons as determined by sodium dodecyl sulfate polyacrylamide gel electrophoresis 
under reducing conditions and which acetylates histones. 

2. The protein of claim 1 consisting of the amino acid sequence of SEQ ID NO: 1.

3. The protein of claim 1 comprising the amino acid sequence of SEQ ID NO: 2. 

4. The protein of claim 1, which also binds to the amino acid sequence of SEQ ID
NO: 3 on a p300 cellular protein and to amino acid residues 1805-1854 of a CBP cellular
protein (SEQ ID NO: 9).

5. A fragment of the protein of claim 1 having histone acetyltransferase activity. 

6. A polypeptide consisting of the amino acid sequence of SEQ ID NO: 2. 

7. A fragment of the protein of claim 1 which binds to the amino acid sequence of 
SEQ ID NO: 3 on the p300 cellular protein and the amino acid sequence of SEQ ID 
NO: 9 on the CBP cellular protein. 

8. A polypeptide consisting of the amino acid sequence of SEQ ID NO:4. 

9. A nucleic acid consisting of the nucleotide sequence of SEQ ID NO: 10. 

10. A nucleic acid having a nucleotide sequence which encodes the protein of claim 1.

11. A nucleic acid having a nucleotide sequence which encodes the protein of claim 2.

12. A nucleic acid having a nucleotide sequence which encodes the protein of claim 3.

13. A nucleic acid consisting of the nucleotide sequence which encodes the protein 
of claim 4. 

14. A nucleic acid complementary to and which selectively hybridizes with the
nucleic acid of claim 11 under stringent hybridization conditions.

15. A fragment of the nucleic acid of claim 9, which encodes a polypeptide that 
acetylates histones. 

16. A fragment of the nucleic acid of claim 9, which encodes a polypeptide which 
binds to the amino acid sequence of SEQ ID NO:3 on the p300 cellular protein and the
amino acid sequence of SEQ ID NO:9 on the CBP cellular protein.

17. A purified antibody which specifically binds the protein of claim 1.

18. A purified antibody which specifically binds the protein of claim 2. 

19. A purified antibody which specifically binds the protein of claim 3. 

20. A purified antibody which specifically binds the protein of claim 4. 

21. An assay for screening substances for the ability to inhibit or stimulate the
histone acetyltransferase activity of P/CAF comprising:

a) contacting the substance with a system in which histone acetylation by 
P/CAF can be determined;

b) determining the amount of histone acetylation by P/CAF in the
presence of the substance; and 

c) comparing the amount of histone acetylation by P/CAF in the 
presence of the substance with the amount of histone acetylation by P/CAF in the 
absence of the substance, a decreased or increased amount of histone acetylation by 
P/CAF in the presence of the substance indicating a substance that can inhibit or 
stimulate, respectively, the histone acetyltransferase activity of P/CAF. 

22. An assay for screening substances for the ability to inhibit binding of P/CAF to 
p300/CBP comprising: 

a) contacting the substance with a system in which the P/CAF binding of 
p300/CBP can be determined;

b) determining the amount of P/CAF binding of p300/CBP in the presence of 
the substance; and 

c) comparing the amount of binding of P/CAF to p300/CBP in the presence of 
the substance with the amount of binding of P/CAF to p300/CBP in the absence of the 
substance, a decreased amount of binding of P/CAF to p300/CBP in the presence of the 
substance indicating a substance that can inhibit binding of P/CAF to
p300/CBP.

23. The method of claim 22, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the p300 protein comprising amino acid residues 
1767-1816 (SEQ ID NO:3) and the protein of claim 4. 

24. The method of claim 22, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the CBP protein comprising amino acid residues 
1805-1854 (SEQ ID NO:9) and the protein of claim 4. 

25. The method of claim 22, wherein the system consists of a cell extract produced 
from cells producing both p300 and P/CAF.

26. An assay for screening substances for the ability to inhibit or stimulate the
histone acetyltransferase activity of p300/CBP comprising: 

a) contacting the substance with a system in which histone acetylation by 
p300/CBP can be determined; 

b) determining the amount of histone acetylation by p300/CBP in the 
presence of the substance; and 

c) comparing the amount of histone acetylation by p300/CBP in the 
presence of the substance with the amount of histone acetylation by p300/CBP in the 
absence of the substance, a decreased or increased amount of histone acetylation by 
p300/CBP in the presence of the substance indicating a substance that can inhibit or 
stimulate, respectively, the histone acetyltransferase activity of p300/CBP. 

27. An assay for screening substances for the ability to inhibit binding of a DNA- 
binding transcription factor to p300/CBP comprising: 

a) contacting the substance with a system in which the DNA-binding 
transcription factor binding of p300/CBP can be determined;

b) determining the amount of DNA-binding transcription factor binding of 
p300/CBP in the presence of the substance; and 

c) comparing the amount of binding of DNA-binding transcription factor to 
p300/CBP in the presence of the substance with the amount of binding of DNA-binding 
transcription factor to p300/CBP in the absence of the substance, a decreased amount of 
binding of DNA-binding transcription factor to p300/CBP in the presence of the 
substance indicating a substance that can inhibit binding of DNA-binding
transcription factor to p300/CBP.

28. The method of claim 27, wherein the system consists of a cell free reaction 
mixture comprising a DNA-binding transcription factor and p300/CBP. 

29. The method of claim 27, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the CBP protein comprising a DNA-binding 
transcription factor and p300/CBP.

30. The method of claim 27, wherein the system consists of a cell extract produced
from cells producing both a DNA-binding transcription factor and p300/CBP. 

31. The method of claim 27, wherein the DNA-binding transcription factor is
selected from the group consisting of a nuclear hormone receptor, CREB, c-Jun/v-Jun,
c-Myb/v-Myb, YY1, Sap-1a, c-Fos, MyoD and SRC-1.

32. A method for inhibiting the transcription modulating activity of P/CAF in a 
subject, comprising administering to the subject a transcription modulating activity 
inhibiting amount of a substance in a pharmaceutically acceptable carrier.

33. The method of claim 32, wherein the substance can inhibit the transcription 
modulating activity of P/CAF by preventing the binding of P/CAF to p300/CBP. 

34. A method for stimulating the transcription modulating activity of P/CAF in a 
subject, comprising administering to the subject a transcription modulating activity 
stimulating amount of a substance in a pharmaceutically acceptable carrier.

35. The method of claim 34, wherein the substance can stimulate the transcription 
modulating activity of P/CAF by promoting the binding of P/CAF to p300/CBP. 

36. The method of claim 34, wherein the substance can stimulate the transcription 
modulating activity of P/CAF by stimulating the histone acetyltransferase activity of
P/CAF. 

37. A method for inhibiting the histone acetyltransferase activity of p300/CBP in a 
subject, comprising administering to the subject a histone acetyltransferase activity 
inhibiting amount of a substance in a pharmaceutically acceptable carrier.

38. The method of claim 37, wherein the substance can inhibit the transcription
modulating activity of p300/CBP by preventing the binding of a DNA-binding 
transcription factor to p300/CBP. 

39. The method of claim 38, wherein the DNA-binding transcription factor is 
selected from the group consisting of a nuclear hormone receptor, CREB, c-Jun/v-Jun, 
c-Myb/v-Myb, YY1, Sap-1a, c-Fos, MyoD and SRC-1.

40. The method of claim 37, wherein the substance is an antibody which binds 
p300/CBP. 

41. A method for stimulating the histone acetyltransferase activity of p300/CBP in a
subject, comprising administering to the subject a histone acetyltransferase activity 
stimulating amount of a substance in a pharmaceutically acceptable carrier. 

42. The method of claim 41, wherein the substance can stimulate the histone 
acetyltransferase activity of p300/CBP by promoting the binding of a DNA-binding 
transcription factor to p300/CBP. 
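
By way of illustration only, the comparison recited in step c) of claims 21, 22, 26 and 27 can be sketched in Python as follows. The function name `classify_substance`, the `tolerance` parameter and the numeric readings are hypothetical assumptions; the claims specify no threshold, units or measurement method.

```python
def classify_substance(with_substance: float,
                       without_substance: float,
                       tolerance: float = 0.0) -> str:
    """Compare a measured amount in the presence and absence of a substance.

    The amounts may be histone acetylation or binding readings in any
    consistent unit. `tolerance` is an assumed fractional change below
    which no effect is called; it is not taken from the claims.
    """
    if without_substance <= 0:
        raise ValueError("control measurement must be positive")
    # Fractional change relative to the control without the substance.
    change = (with_substance - without_substance) / without_substance
    if change < -tolerance:
        return "inhibits"
    if change > tolerance:
        return "stimulates"
    return "no effect detected"


# Hypothetical readings in arbitrary units.
print(classify_substance(40.0, 100.0))        # inhibits
print(classify_substance(150.0, 100.0))       # stimulates
print(classify_substance(100.0, 100.0, 0.1))  # no effect detected
```

For the binding assays of claims 22 and 27, only a decreased amount is recited, so only the "inhibits" outcome would be of interest there.
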